Leiden Weibo Corpus - Home
Welcome to the home page of the Leiden Weibo Corpus, which consists of 5,103,566 messages posted on Sina Weibo in January 2012.

Sina Weibo is China's most popular microblogging service, and its 300 million users post more than 100 million messages a day. This corpus was designed to make it easier to explore the wealth of data these users generate. If you're interested, you can read about how the corpus was built, see which words are most frequently used on Sina Weibo, or look at a few random messages or a map representation of our data. You can also go right ahead and use the search functionality below to start exploring.

Enjoy, and if you have any questions, suggestions for improvements, or comments, please feel free to get in touch.
Message ID
Grammar
Gender: Both, Male, Female
Lexical data
Single word
Beginning with
Ending in