The Hautes-Alpes has already gathered 22,000 SMS!

Quebec has already gathered 5,000 SMS!

Switzerland has just finished collecting 24,000 SMS!

Switzerland has published its own website!
La Réunion has just finished collecting 12,000 SMS!





The aim of the sms4science project is to build and study an international corpus of text messages (SMS, txt, texto, etc.). The project is coordinated by CENTAL (Centre for Natural Language Processing) at the Université catholique de Louvain, Belgium. This coordination is financially supported by Belgacom, patron of the project.


To obtain or buy the corpus, please contact us!

The 15 current members of the academic network















What's new

2016: The corpus collection is over. Buy the sms corpus

2014: SMS Communication: A Linguistic Approach is out! Buy the book

March 2012: SMS Communication: A Linguistic Approach (More info)

December 2011: LREC workshop: @NLP can u tag #user_generated_content ?!

September 2011: sms project launched in Montpellier!

June 2011: The Italian and Romansh parts of Switzerland are completing their collection!


January 2011: Call for papers on SMS! "TEXTOS : dimensions culturelles, linguistiques et pragmatiques", 79th ACFAS Congress, Université de Sherbrooke, 12-13 May 2011.

October 2010: sms project launched in Hautes-Alpes!

June 2010: sms4science on video in Switzerland.

March 2010: Canadian project on TV!

January 2010: sms4science on RSR Radio!





On Amazon!