The Hautes-Alpes has already gathered 22,000 SMS!

Quebec has already gathered 5,000 SMS!

Switzerland just finished a 24,000 SMS collection!

Switzerland published its own website!

La Réunion just finished a 12,000 SMS collection!





The aim of the sms4science project is to build up and study an international corpus of text messages (SMS, txt, texto, etc.). The sms4science project is coordinated by the CENTAL (Centre for Natural Language Processing), at the Université catholique de Louvain, Belgium. This coordination is financially supported by Belgacom, patron of the project.

Are you an academic or a potential sponsor who wants to take part in this project for your country? Please contact us!


Are you an individual, a research centre or a business that wants to buy our corpus? You can contact the UCL technology transfer department (Sopartec) directly, and more precisely Mr. Frédéric OOMS.

The 15 current members of the academic network















What's new

March 2012: SMS Communication: A Linguistic Approach (More info)

December 2011: LREC workshop: @NLP can u tag #user_generated_content?!

September 2011: SMS project launched in Montpellier!

June 2011: The Italian and Romansh parts of Switzerland are completing their collection!


January 2011: Call for papers on SMS! "TEXTOS : dimensions culturelles, linguistiques et pragmatiques", 79th ACFAS Congress, Université de Sherbrooke, 12-13 May 2011.

October 2010: SMS project launched in Hautes-Alpes!

June 2010: sms4science on video in Switzerland.

March 2010: Canadian project on TV!

January 2010: sms4science on RSR Radio!





On Amazon!