TY - GEN
T1 - Towards modelling language innovation acceptance in online social networks
AU - Kershaw, Daniel
AU - Rowe, Matthew
AU - Stacey, Patrick
N1 - Publisher Copyright: Copyright is held by the owner/author(s).
PY - 2016/2/8
Y1 - 2016/2/8
N2 - Language change and innovation are constant in online and offline communication, and have led to new words entering people's lexicons and even modern-day dictionaries, with recent additions such as 'e-cig' and 'vape'. However, the manual work required to identify these 'innovations' is both time-consuming and subjective. In this work we demonstrate how such innovations in language can be identified across two different OSNs (Online Social Networks) through the operationalisation of known language acceptance models that incorporate relatively simple statistical tests. By grounding our work in language theory, we identified three statistical tests that can be applied: variation in frequency, form and meaning. Each shows different success rates across the two networks (a geo-bound Twitter sample and a sample of Reddit). These tests were also applied at different community levels within the two networks, allowing different innovations to be identified across different community structures, for instance regional variation across Twitter and variation across groupings of subreddits; identified example innovations included 'casualidad' and 'cym'.
AB - Language change and innovation are constant in online and offline communication, and have led to new words entering people's lexicons and even modern-day dictionaries, with recent additions such as 'e-cig' and 'vape'. However, the manual work required to identify these 'innovations' is both time-consuming and subjective. In this work we demonstrate how such innovations in language can be identified across two different OSNs (Online Social Networks) through the operationalisation of known language acceptance models that incorporate relatively simple statistical tests. By grounding our work in language theory, we identified three statistical tests that can be applied: variation in frequency, form and meaning. Each shows different success rates across the two networks (a geo-bound Twitter sample and a sample of Reddit). These tests were also applied at different community levels within the two networks, allowing different innovations to be identified across different community structures, for instance regional variation across Twitter and variation across groupings of subreddits; identified example innovations included 'casualidad' and 'cym'.
KW - Language change
KW - Language evolution
KW - OSN
KW - Reddit
KW - Twitter
UR - http://www.scopus.com/inward/record.url?scp=84964409597&partnerID=8YFLogxK
U2 - 10.1145/2835776.2835784
DO - 10.1145/2835776.2835784
M3 - Conference contribution
AN - SCOPUS:84964409597
T3 - WSDM 2016 - Proceedings of the 9th ACM International Conference on Web Search and Data Mining
SP - 553
EP - 562
BT - WSDM 2016 - Proceedings of the 9th ACM International Conference on Web Search and Data Mining
PB - Association for Computing Machinery, Inc
T2 - 9th ACM International Conference on Web Search and Data Mining, WSDM 2016
Y2 - 22 February 2016 through 25 February 2016
ER -