Like all other types of visualization, linguistic mapping has two main goals: data presentation and data analysis. The most common use of linguistic maps is simply to point to the location of one or more languages of interest (presentation). A more sophisticated task is showing the distribution of particular linguistic features, or combinations of them, among the languages of a certain area (presentation and analysis). Three linguistic subdisciplines use maps for visualization: linguistic typology, areal linguistics and dialectology....
On 21-22 April, the London School of Economics hosted the Text Analysis Package Developers’ Workshop, a two-day event in London that brought together developers of R packages for working with text and text-related data. The packages represented cover a wide range of applications: string handling (stringi) and tokenization (the rOpenSci-onboarded tokenizers, KoNLP), corpus and text processing (readtext, tm, quanteda, and qdap), natural language processing (NLP) such as part-of-speech and dependency tagging (cleanNLP, spacyr), and the statistical analysis of textual data (stm, text2vec, and koRpus) – although this list is hardly complete....
You can find members of the rOpenSci team at various meetings and workshops around the world. Come say ‘hi’, learn how our packages can enable your research, find out about our onboarding process for contributing new packages, discuss software sustainability, or tell us how we can help you do open and reproducible research.
...There’s a lot of work that goes into making software: the code that does the thing itself, unit testing, examples, tutorials, documentation, and support. rOpenSci software is created and maintained both by our staff and by our (awesome) community. In keeping with our aim to build the capacity of software users and developers, three interns from our academic home at UC Berkeley are now working with us as well. Our interns are mentored by Carl Boettiger, Scott Chamberlain, and Karthik Ram, and they will receive academic credit and/or pay for their work....
There is no problem in science quite as frustrating as other people’s data. Whether it’s malformed spreadsheets, disorganized documents, proprietary file formats, data without metadata, or any other data scenario created by someone else, scientists have taken to Twitter to complain about it. As a political scientist who regularly encounters so-called “open data” in PDFs, I find this problem particularly irritating. PDFs may have “portable” in their name, which lets them display consistently on various platforms, but that portability means any information contained in a PDF is irritatingly difficult to extract computationally....