Joining language with data and data with data

Karsten Boye Rasmussen

doi:10.29173/iq1006

Abstract

Welcome to the IASSIST Quarterly first issue of 2021 and of volume 45 (IQ 45(1) 2021).

I always find it interesting to learn more about other research areas. Often, I find that approaches from less well-known areas can be transformed and transferred to my own areas, or that they make me aware of problems hitherto unwisely ignored that now become potentials. In the case of the first article, you will become aware of the connection with linguistics from a data viewpoint. As a search for datasets requires using the words of a language, it is obvious that linguistic knowledge can be of benefit. However, most things are obvious when you think of them - afterwards. Often, it is because you did not think it through - beforehand - that new information surprises you. And this also gives you a good opportunity to thank people who are working in areas you had not thought of - before. The benefits of combining and merging types of data, such as linking survey data and social media data, are obvious - again! Before you start the journey of joining these types of data, the second article will provide you with valuable information gained from the experience of several projects and exemplified through cases using Twitter, Facebook, and LinkedIn.

The first article shows its support for diversity in research areas already in the title: 'A recommendation to the SSH community: take a linguist on board', authored by Jeannine Beeken of UK Data Service at University of Essex (UK). Theories and methods of linguistics are obviously relevant for data services, where the search for and retrieval of data collections from vast data archives is an important step in the process towards analysis and findings in data. Beeken starts by introducing us to areas of this important retrieval step that are supported by Natural Language Processing (NLP), which increases findability, with the result that relevant descriptions of data collections are identified through online search. A simple example is that when searching for survey questions concerning 'war' the results will also include those from a search for 'armed conflict'. The development and upkeep of thesauri and language relationships is a huge and valuable task that is itself supported by linguistics and computers, for example by intelligent creation of metadata for studies. Linguistic knowledge is not only relevant for finding data but also valuable for the production of data. Computational linguistics has made great progress for the growing number of studies based on texts and data in the form of interviews. In the article, speech recognition and speech-to-text transcription are mentioned, and the resulting interview transcription text can again become the subject of further computer and linguistic analysis.
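The 'war' and 'armed conflict' example can be illustrated with a minimal sketch of thesaurus-based query expansion. This is not the method described in Beeken's article or used by any particular data service; the thesaurus entries, the function names, and the sample descriptions are all hypothetical and serve only to show the idea.

```python
import re

# Hypothetical mini-thesaurus (an assumption for illustration only);
# a real data service would rely on a curated, much larger thesaurus.
THESAURUS = {
    "war": {"armed conflict"},
    "armed conflict": {"war"},
}


def expand_query(term):
    """Return the normalised search term plus any related thesaurus terms."""
    key = term.lower().strip()
    return {key} | THESAURUS.get(key, set())


def search(term, descriptions):
    """Return the descriptions that mention the term or a related term."""
    terms = expand_query(term)
    # Word boundaries prevent 'war' from matching inside 'reward' or 'towards'.
    patterns = [re.compile(r"\b" + re.escape(t) + r"\b") for t in terms]
    return [d for d in descriptions if any(p.search(d.lower()) for p in patterns)]


# Hypothetical study descriptions.
descriptions = [
    "Attitudes towards war and peace",
    "Experiences of armed conflict among civilians",
    "Reward schemes in the workplace",
]

results = search("war", descriptions)
for description in results:
    print(description)
```

In this sketch a search for 'war' also retrieves the description that only mentions 'armed conflict', while the unrelated description containing 'Reward' is left out.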

The second article, 'Informed consent for linking survey and social media data - differences between platforms and data types', will prepare you for the benefits and obstacles of joining data from surveys and social media. The article is one of the outcomes of several projects in which the authors Johannes Breuer, Tarek Al Baghal, Luke Sloan, Libby Bishop, Dimitra Kondyli, and Apostolos Linardis participated. The authors are based at GESIS in Germany, Essex University and Cardiff University in the UK, and EKKE in Greece. The article draws on their own projects as well as on other specific projects that supply the examples in the article.

When using self-reporting in research surveys - in this case for the study of social media - the data can prove to be unreliable. On the other hand, when research obtains data directly from the social media platforms, the background, attitude, and behaviour variables for individuals are sparse compared to surveys. The obvious solution is to link such data collections. However, the linking requires informed consent. The 'joining of data' and 'informed consent' imply awareness of legal regulations - like GDPR (General Data Protection Regulation) in Europe - as well as ethical standards and guidelines of relevant institutions. The article discusses these issues and demonstrates them through three studies that used data from Twitter, Facebook, and LinkedIn. Furthermore, the regulations and setups of the social platforms have to be well scrutinized. For example, if respondents share their private data, this may also affect the privacy rights of others, and data that are collected via APIs may have special restrictions with regard to data sharing. The appendices of the article contain the full text used in the various projects for explicating the use of data and the conditions for the linking of survey and social media data. In addition to raising general awareness and giving a good overview of the problems of using social media data, the article will help you to be well prepared, as the cases discussed include many valuable references to pursue if you plan on commencing a project linking social media data with survey data.
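The principle that linking requires informed consent can be sketched in a few lines of code. The following is an illustration only and not the procedure of any of the projects discussed in the article: the field names (`respondent_id`, `consent_to_link`, `handle`), the sample records, and the function name are all hypothetical, and a real project would additionally have to follow GDPR, institutional guidelines, and the terms of the platform.

```python
# Hypothetical survey records; 'consent_to_link' stands for the respondent's
# informed consent to linking the survey answers with social media data.
survey = [
    {"respondent_id": 1, "age": 34, "consent_to_link": True, "handle": "@anna"},
    {"respondent_id": 2, "age": 51, "consent_to_link": False, "handle": "@ben"},
    {"respondent_id": 3, "age": 27, "consent_to_link": True, "handle": None},
]

# Hypothetical social media data, keyed by account handle.
social = {
    "@anna": {"posts": 120},
    "@ben": {"posts": 45},
}


def link_with_consent(survey_rows, social_by_handle):
    """Link survey rows to social media data only where consent was given."""
    linked = []
    for row in survey_rows:
        if not row["consent_to_link"]:
            continue  # no consent: this respondent's data are never linked
        profile = social_by_handle.get(row["handle"])
        if profile is None:
            continue  # consent given, but no matching account data
        # The handle itself is left out of the linked record.
        linked.append(
            {"respondent_id": row["respondent_id"], "age": row["age"], **profile}
        )
    return linked


linked = link_with_consent(survey, social)
print(linked)
```

Only the first respondent appears in the linked result: the second did not consent, although account data exist, and the third consented but provided no account to link.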

Enjoy the reading.

Submissions of papers for the IASSIST Quarterly are always very welcome. We welcome input from IASSIST conferences or other conferences and workshops, from local presentations, or papers especially written for the IQ. When you are preparing such a presentation, give a thought to turning your one-time presentation into a lasting contribution. Doing that after the event also gives you the opportunity to improve your work after feedback. We encourage you to log in or create an author profile at https://www.iassistquarterly.com (our Open Journal System application). We permit authors to have 'deep links' into the IQ as well as to deposit the paper in their local repository. Chairing a conference session or workshop with the purpose of aggregating and integrating papers for a special issue of the IQ is also much appreciated, as the information reaches many more people than the limited number of session participants and will be readily available on the IASSIST Quarterly website at https://www.iassistquarterly.com. Authors are very welcome to take a look at the instructions and layout:

https://www.iassistquarterly.com/index.php/iassist/about/submissions

Authors can also contact me directly via e-mail: kbr@sam.sdu.dk. Should you be interested in compiling a special issue for the IQ as guest editor(s) I will also be delighted to hear from you.

Karsten Boye Rasmussen - March 2021
