Comparison of the most popular Czech and German lexemes in the global Internet search engine Google (2015)


Dana Gálová (Institute of Technology and Business in České Budějovice)

Article received: 05-09-2017 | Article accepted: 23-10-2017

ABSTRACT: Google, the most widely used search engine in the world, represents the largest and most current language databank, covering almost all of the world's languages. Every year it publishes statistics on the most searched words, which indicate the interests and preferences of a particular population in a given year. The aim of this paper is to analyze and compare the most popular lexemes in the Czech and German versions of the Google search engine in 2015. Classifying the lexemes into thematic fields made it possible to identify similarities and differences and thus to compare the interests and preferences of the Czech and German populations. The paper also offers an interpretation of the differences found and a reflection on the causes of the different degrees of civic engagement in the two populations.

KEY WORDS: Google, lexeme, comparison, community involvement, search engine


This study was carried out under project No. 201609 of the Internal Grant Competition at the Institute of Technology and Business in České Budějovice.


1. Introduction

The development of computer technology has allowed Internet users to find plenty of information and to work with various data to a previously unthinkable extent. The Internet, containing hundreds of billions of words, represents the biggest language corpus:

A language corpus can be understood as an internally structured, unified and usually indexed, extensive and comprehensive set of electronically stored and processed linguistic data, mostly in text form, organized with a view to its use for a particular purpose, with respect to which it is then considered representative. (Čermák, 1995: 119)

It is constantly expanding and changing, compiled by millions of users according to their different needs and interests (Bickel, 2006). All language data in search engines can also be found in their natural context, which allows their widespread use not only in linguistics (especially in lexicography, contrastive linguistics, translatology, discourse analysis, etc.), but also in sociology, psychology and other sciences (Chlumská, 2014). From the linguistic point of view, the Web corpus represents the largest and most accessible source of linguistic material, whose biggest advantages are the speed, efficiency and ease of working with the data (Cvrček & Kováříková, 2011).

The trend of using traditional corpora and the Web as a corpus has been especially noticeable in recent decades. Thanks to the rapid development of information technology, corpus linguistics began to develop as an independent linguistic discipline, which allowed ever closer links between linguistics and empirical descriptive research (Chlumská, 2014). In recent decades, many corpora have been created in various languages (e.g. Brown Corpus, British National Corpus, SYN2000, the Freiburg-LOB Corpus of British English, Deutsches Referenzkorpus), and their number is expected to grow rapidly in the future. This will enable further innovative linguistic research (e.g. identifying the frequency of words in texts, searching for words in the wider context of a given word, examining semantic prosody, identifying multi-word units, etc.). Cvrček and Kováříková note that

frequency is very important for the study of language, in particular because it provides information about the center and periphery of linguistic phenomena, according to which a description of language should be structured, so that marginal phenomena are not described in detail while a vast area of linguistic phenomena is left out. (2011: 116)

While traditional language corpora are always focused on a specific language, the Web corpus makes it possible to search for the most frequently used words in different countries by Internet domain. Google, the most widely used search engine, represents the most current language databank in the world and annually publishes statistics on the most popular words, which indicate the interest preferences of a particular population in a given year.

The aim of this paper is to analyze and then compare the most popular Czech and German lexemes in the Google search engine in 2015. The Czech Republic and the Federal Republic of Germany are linked not only by a common state border, but also by a similar history, cultural environment and lifestyle. It can therefore be assumed that the information Czech and German Internet users search for most will also be similar.

This contrastive linguistic study may be an interesting contribution to sociological surveys of the national identity of these two populations.

2. Methods

The subject of the analysis and subsequent comparison is the ten Czech and ten German lexemes that were most searched for in the Czech Republic and the Federal Republic of Germany in 2015 using the Google search engine. These words were taken from the official statistics that Google has published annually on its website through Zeitgeist (literally «spirit of the times») since 2001. For the purposes of the study, the method of (thematic) content analysis was chosen, with the lexeme as the basic analytical unit. Based on the semantic description of the twenty lexemes in the two rankings, the following five thematic categories were identified in keeping with the basic criteria (Merten, 1995): 1/ politics, society, 2/ leisure, entertainment, 3/ finance, 4/ information technology, 5/ natural phenomena[1]. The subsequent comparison of the individual thematic fields allowed the identification of similarities and differences, their interpretation, and their possible causes.

3. Results and discussion

Tables 1 and 2 list the most popular German and Czech lexemes in the Google search engine in 2015, in rank order, with a description of each.

The most popular lexemes in the Federal Republic of Germany
  1. Sonnenfinsternis / Solar eclipse: The natural phenomenon of 20 March 2015 became the most popular lexeme overall among German Internet users. Along with «Windows 10», it is one of only two lexemes that appeared in both the German and the Czech rankings.
  2. Pegida: The citizens' initiative Pegida (Patriotische Europäer gegen die Islamisierung des Abendlandes), formed in October 2014 in Dresden, calls for tougher conditions for the admission of immigrants and for more direct democracy (the movement's main adversaries are foreigners, politics and the media).[2]
  3. Flugzeugabsturz / Plane crash: The crash of the Airbus A320 of the German airline Germanwings, which was flying from Barcelona to Düsseldorf on 24 March 2015. None of the 150 people on board survived (among the victims were 67 Germans, including 16 schoolchildren).
  4. Dschungelcamp: A popular reality show broadcast in Germany since 2004.
  5. Paris: The terrorist attacks by Islamic State on the French capital on the night of 13 to 14 November 2015 claimed a total of 130 lives.
  6. iPhone: Apple's flagship product, with which the company introduced a new generation of smartphones.
  7. Griechenland / Greece: The search for a solution to the Greek crisis, which has its origins in the 1990s, led in 2015 to intensive negotiations between representatives of Greece and the European Monetary Union, which ended with the Greek side accepting the demands of its international creditors.
  8. Charlie Hebdo: The attack on the editorial office of the satirical magazine Charlie Hebdo, which is known among other things for its caricatures of Islam, took place on 7 January 2015 in Paris. It left 12 dead and 10 injured.
  9. Helmut Schmidt: One of the most prominent politicians of postwar Germany, a Social Democrat and former Chancellor of Germany, died on 10 November 2015.
  10. Windows 10: The latest version of Microsoft's operating system for personal computers.

Tab. 1: Overview of the most popular German lexemes in 2015.

The most popular lexemes in the Czech Republic
  1. Akcie ČEZ / ČEZ shares currently: The surprisingly large interest in this phrase was probably due to a sharp fall in the energy company's share price in 2015.
  2. Agario: A multiplayer online game that became a phenomenon soon after its release on 28 April 2015.
  3. MS hokej 2015 / WCH Hockey 2015: The Ice Hockey World Championship was held on 1–17 May 2015 in Prague and Ostrava.
  4. Pixwords: A challenging picture crossword puzzle, rated as one of the best Android games, which tests vocabulary, imagination and attention to detail.
  5. 50 odstínů šedi / Fifty Shades of Grey: An American erotic drama film whose release was preceded by a large advertising campaign in the media.
  6. Výměna manželek / Wife Swap: Originally a British reality show (broadcast in the Czech Republic since 2013).
  7. Velikonoce 2015 / Easter 2015: An important Christian holiday.
  8. Windows 10: The latest version of Microsoft's operating system for personal computers.
  9. Eurojackpot: The most popular European lottery (also available online).
  10. Zatmění Slunce / Solar eclipse: The natural phenomenon of 20 March 2015. Along with «Windows 10», it is one of only two lexemes that appeared in both the German and the Czech rankings.

Tab. 2: Overview of the most popular Czech lexemes in 2015.

The description of the twenty lexemes listed in Tables 1 and 2 led to the identification of the following five thematic categories:

1/ Politics, society

2/ Leisure, entertainment

3/ Finance

4/ Information technology

5/ Natural phenomena

Individual survey units (lexemes) were assigned to a particular category and then statistically evaluated (see Chart 1).

It is apparent from the chart at first glance that the thematic structure of the two rankings is completely different. The most striking difference was recorded in political and social life, which is clearly the most important topic for the German population. Six out of ten German lexemes fall into this category: four refer to major pan-European social phenomena (Flugzeugabsturz / plane crash, Paris, Griechenland / Greece, Charlie Hebdo) and two to domestic political events (Pegida, Helmut Schmidt). Two lexemes belong to the field of modern information technology (iPhone, Windows 10), one to TV entertainment (Dschungelcamp) and one to natural phenomena (Sonnenfinsternis / solar eclipse). In the Czech ranking, lexemes of entertainment and sport clearly dominate (Výměna manželek / Wife Swap, MS hokej 2015 / WCH Hockey 2015, 50 odstínů šedi / Fifty Shades of Grey, Velikonoce 2015 / Easter 2015, Agario, Pixwords), followed by lexemes from the categories of finance (Akcie ČEZ / ČEZ shares, Eurojackpot), IT (Windows 10) and natural phenomena (zatmění slunce / solar eclipse).

Chart 1: Comparison of thematic categories in the Czech and German versions of Google in 2015.
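
The tally behind Chart 1 can be reproduced from the category assignments given in the text. The following is a minimal Python sketch, not the author's own procedure: the dictionary names, the `tally` helper and the choice of Python are illustrative assumptions; only the lexemes and their categories are taken from the article.

```python
from collections import Counter

# The five thematic categories identified in the article.
CATEGORIES = [
    "politics, society",
    "leisure, entertainment",
    "finance",
    "information technology",
    "natural phenomena",
]

# Category assignments as described in the text (hypothetical data layout).
GERMAN = {
    "Sonnenfinsternis": "natural phenomena",
    "Pegida": "politics, society",
    "Flugzeugabsturz": "politics, society",
    "Dschungelcamp": "leisure, entertainment",
    "Paris": "politics, society",
    "iPhone": "information technology",
    "Griechenland": "politics, society",
    "Charlie Hebdo": "politics, society",
    "Helmut Schmidt": "politics, society",
    "Windows 10": "information technology",
}

CZECH = {
    "Akcie ČEZ": "finance",
    "Agario": "leisure, entertainment",
    "MS hokej 2015": "leisure, entertainment",
    "Pixwords": "leisure, entertainment",
    "50 odstínů šedi": "leisure, entertainment",
    "Výměna manželek": "leisure, entertainment",
    "Velikonoce 2015": "leisure, entertainment",
    "Windows 10": "information technology",
    "Eurojackpot": "finance",
    "Zatmění Slunce": "natural phenomena",
}


def tally(ranking):
    """Count lexemes per category, including categories with zero lexemes."""
    counts = Counter(ranking.values())
    return {category: counts.get(category, 0) for category in CATEGORIES}


german_counts = tally(GERMAN)
czech_counts = tally(CZECH)

# Print one row per category, German count next to Czech count.
for category in CATEGORIES:
    print(f"{category:25} DE={german_counts[category]}  CZ={czech_counts[category]}")
```

Keeping the assignments in plain dictionaries makes each coding decision visible, so a reader who disagrees with a category (for instance, treating Easter as something other than leisure) can change one entry and see how the distribution shifts.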

The results clearly show that:

Germans are most interested in social and political issues (six out of ten lexemes).

Czechs show little interest in social and political topics; their searches are dominated by entertainment, chiefly television and online games (also six of the ten lexemes).

The only topics common to Czechs and Germans are information technology and natural phenomena (two shared lexemes).

This comparison did not confirm the original hypothesis that the search terms would be broadly similar. It can be assumed that there are several causes of the different ways in which Czechs and Germans view the world around them and themselves (the different historical development of the two countries, the greater participation of Germans in social and political affairs, different value systems, Germans' greater concern with their own safety and social security, especially in the current context of the economic crisis in Greece and the refugee crisis, etc.). The comparison of the two rankings, however, clearly indicates that the fundamental difference between Czechs and Germans lies mainly in the different degree of public involvement.

In recent decades, the notion of community involvement has become a permanent part of public and professional discourse. Community involvement, as one of the main attributes of a modern developed society, refers primarily to how much the citizens of a given country are involved in public life (public activities), how interested they are in politics, how strongly they identify with the democratic system, and to what extent and in which areas they engage voluntarily. Its common features are voluntary commitment, incorruptibility, and an orientation towards the common good and the public (Daphi, 2010).

Community involvement in the Czech Republic:

In 2015, a comprehensive and unique study was carried out in the Czech Republic by the Centre of Civic Education at the Faculty of Humanities in cooperation with the Konrad Adenauer Foundation and TNS Aisa. It brought a number of interesting findings on public involvement in Czech society[3].

Research results:

  1. The Czech population is very heterogeneous in terms of public involvement, faith in its own abilities, and attitudes towards the future.
  2. Only a small group of people (7%) is strongly socially involved in many issues at all levels.
  3. If Czechs are civically engaged, it is mainly in non-political activities (support for the disabled and sick and for people in difficult situations – 35%, support for people affected by disasters – 27%, leisure activities – 24%, the environment – 22%). The most common forms of support are still cash contributions, material gifts and petitions; people are least involved in protests and strikes.
  4. Political activities are a marginal topic for most of the Czech population: 8.8% are committed to «democracy and human rights», 7.3% to the right to co-decide (referendums, public debates) and 1.8% to the European Union.
  5. The low participation of Czech citizens in social life is not always an expression of laziness, but rather a social and cultural problem.

According to the authors of this research (Matějka et al., 2015), the fact that most of the population of the Czech Republic engages in civic activities very little or not at all is a sign of instability: most activities are taken over by a small group of citizens, and the consensual model of society is threatened. The research also confirmed a close relationship between trust in democratic institutions and civic commitment, and it emphasized the need for civic education.

Similar results have been obtained in the surveys conducted by the STEM agency approximately every four years. According to them, most Czechs believe that it is better not to engage in politics, so as not to burn one's fingers. According to these surveys, people who are more worried about the security risks of political involvement are also more likely to feel helpless and often perceive politics as a sphere in which only the interests of individual politicians are promoted. The elderly and people with basic education or an apprenticeship often associate political activity with danger.

Community involvement in Germany:

The involvement of Germans in civic activities is considerable, and Germany can in many respects serve as a model of an engaged civil society. J. Hošek (2011) states that Germans are convinced that they themselves can significantly influence public affairs. They have genuine respect for the democratic establishment, carefully follow the political scene and are themselves very active participants in political life. They are well informed and committed, as evidenced by their membership in many foundations, associations, charitable organizations, political parties, trade unions and non-governmental organizations, as well as by their personal involvement in many demonstrations, petition drives and street happenings. The country has about 21,000 foundations (a federal average of 26 foundations per 100,000 inhabitants), which together spend about €17 billion annually on charitable purposes (social welfare, education, science and culture) <https://www.tatsachen-ueber-deutschland.de/de/rubriken/gesellschaft/engagierte-zivilgesellschaft>. In the last twenty years, 400 civic foundations supporting local and regional projects have been created in Germany. Their importance has increased, especially in the context of the refugee crisis, in which more than half of them are engaged. According to Bilanz des Helfens, a study by the market research company GfK, Germans donated a total of €5.5 billion in 2015 to church and humanitarian aid. According to the latest survey by the Federal Ministry for Family Affairs (BMFSFJ) from 2014, about 31 million Germans are involved as volunteers, which is ten percent more than 15 years ago. Young people represent great potential for voluntary involvement. Of a total of 615,000 civil society organizations, one in six focuses on working with children and youth (ZiViZ, 2013).

A significant difference in the public involvement of young Czechs and Germans was confirmed by the results of a unique OECD survey comparing 40 industrialized countries (the sample covered the 15–29 age group). Germany took second place in this comparison (behind Denmark), while the Czech Republic finished in the penultimate, 39th place.

4. Conclusion

The research presented here uses the Web as a source for corpus-linguistic (sociolinguistic) investigation. The aim of this study was to compare the most popular Czech and German lexemes. The comparison of the thematic categories clearly demonstrated a marked difference between the interest preferences of the Czech and German populations. A subsequent comparison with related sociological studies confirmed a markedly different level of public involvement among Czechs and Germans, and this trend is likely to continue in the future. It is clear that many years of repression of civil society in the Czech Republic (and in other post-communist countries) have instilled in many people fear and a reluctance to engage. The Czech Republic will probably still need time to deal with the deficits of its democratic system (high corruption, unstable party systems, low trust in politics, low voter turnout, skepticism, reluctance to engage, etc.), which is a precondition for the creation of a truly mature, functioning and engaged civil society (Heydemann & Vodička, 2013).

The relationship between the most frequent lexemes in Internet search engines and the interest preferences of their users (populations) is a topical issue in sociolinguistic research and may become the subject of further replication studies.

5. Works Cited

Bickel, Hans (2006). “Das Internet als linguistisches Korpus”. Linguistik online. <https://bop.unibe.ch/linguistik-online/article/view/612/1053>. (2-10-2016).

Cvrček, Václav & Dominika Kováříková (2011). “Možnosti a meze korpusové lingvistiky”. Naše řeč 94: pp. 113–133.

Čermák, František (1995). “Jazykový korpus: Prostředek a zdroj poznání”. Slovo a slovesnost 56: pp. 119–140.

Daphi, Priska et al. (eds.) (2010). Engagierte Menschen: vier Fallstudien. Berlin: Maecenata Institut für Philanthropie und Zivilgesellschaft an der Humboldt-Universität zu Berlin.

GfK (2015). GfK Verein Jahresbericht 2014_2015. <http://www.gfk-verein.org/search/site/bilanz%2520des%2520helfens%25202015%2520jahresbericht>. (18-11-2016).

Hošek, Jiří (2011). Německo v přímém přenosu: naši sousedé včera a dnes. Praha: Brána.

Heydemann, Günther & Karel Vodička (2013). Vom Ostblock zur EU. Systemtransformationen 1990–2012 im Vergleich. Göttingen: Vandenhoeck & Ruprecht.

Chlumská, Lucie (2014). “Není korpus jako korpus: korpusy v kontrastivní lingvistice a translatologii”. Časopis pro moderní filologii 96: pp. 221–232.

Matějka, Ondřej, Jan Krajhanzl, Tomáš Protivínský & Barbora Bakošová (2015). “Občanská angažovanost 2015. Mapa české společnosti z hlediska občanské angažovanosti”. Obcanskevzdelavani.cz. <http://www.obcanskevzdelavani.cz/download/661/COV_Obcanska-angazovanost-2015_zaverecna-zprava.pdf>. (5-10-2016).

Merten, Klaus (1995). Inhaltsanalyse: Einführung in Theorie, Methode und Praxis. Wiesbaden: Springer Fachmedien GmbH.

Bundesministerium für Familie, Senioren, Frauen und Jugend (n.d.). Immer mehr Menschen engagieren sich ehrenamtlich. <https://www.bmfsfj.de/bmfsfj/aktuelles/alle-meldungen/immer-mehr-menschen-engagieren-sich-ehrenamtlich/109030?view=DEFAULT>. (22-9-2016).

STEM. Postoje našich občanů k politické aktivitě a k politikům. <https://www.stem.cz/postoje-nasich-obcanu-k-politicke-aktivite-a-k-politikum/>. (22-9-2016).

OECD (2016). Not interested for politics? <http://www.oecd-ilibrary.org/sites/9789264261488-en/07/03/index.html?itemId=/content/chapter/soc_glance-2016-28-en&mimeType=text/html/>. (23-10-2016).

Tatsachen über Deutschland (n.d.). Engagierte Zivilgesellschaft. Frankfurt am Main: Frankfurter Societäts-Medien GmbH. <https://www.tatsachen-ueber-deutschland.de/de/rubriken/gesellschaft/engagierte-zivilgesellschaft>. (17-11-2016).

Vorländer, H., M. Herold & S. Schäller (2015, 19 October). “Was ist Pegida und warum?” Frankfurter Allgemeine Zeitung. <https://tu-dresden.de/gsw/phil/powi/poltheo/ressourcen/dateien/download/dokumente_pegida/FAZ-Bericht-vom-19.10.2015.pdf?lang=de>. (23-9-2016).

Zivilgesellschaft für junge Menschen (n.d.). Zivilgesellschaft in Zahlen. Gütersloh: Bertelsmann Stiftung, 1/2013, p. 2. [cit. 2016-09-28]. <http://www.ziviz.info/fileadmin/ZiviZ-Praxis/Konkret_1.pdf>. (23-11-2016).

Caracteres vol.6 n2


Notes:

  1. Category 1 – politics, society: this category covers every lexeme relating to a social or political phenomenon (e.g. terrorist attacks, political movements, prominent politicians, aviation disasters, economic crises). Category 2 – leisure, entertainment: lexemes related to television, film, sports, holidays and games. Category 3 – finance: all lexemes from the field of finance, e.g. shares, cash prizes. Category 4 – information technology: topics such as operating systems and mobile phones. Category 5 – natural phenomena: this category is represented by a single phenomenon that occurs in both rankings, the solar eclipse.
  2. Source: Frankfurter Allgemeine Zeitung. Frankfurt am Main: 19/10/2015
  3. Data were collected at the turn of 2014 and 2015 with 3,876 respondents aged 15-65 years.

Caracteres. Estudios culturales y críticos de la esfera digital | ISSN: 2254-4496 | Salamanca