3. Digital Humanities and languages for specific purposes
Site: | ИРНИТУ Open Courses |
Course: | Digital Humanities |
Book: | 3. Digital Humanities and languages for specific purposes |
Printed by: | Guest |
Date: | Saturday, 11 October 2025, 03:01 |
1. Language for specific purposes
Language for Specific Purposes (LSP) refers to a specialized area within applied linguistics that focuses on the specific linguistic needs of particular professional or academic groups, distinct from general language use. Here are some key aspects that define LSP:
1) Specialized Vocabulary and Structures: LSP involves language that contains terminology, phrases, and grammatical structures specific to a particular field, such as medicine, law, engineering, business, or academic research.
Examples of language specificity in different professional fields:
Academic Writing
- Complex sentences with multiple clauses to convey detailed information and relationships between ideas.
- Passive voice to focus on the action or result rather than the actor, especially in scientific writing.
- Nominalization, where verbs or adjectives are converted into nouns to create a more formal tone.
Legal Language
- Highly complex sentence structures, often with multiple nested clauses.
- Modal verbs (such as shall, must, may) to indicate obligations, rights, or possibilities.
- Precise and archaic vocabulary, sometimes leading to long and dense noun phrases:
Herein, Hereof, Hereunder, Hereto – these words are used to refer to matters within the document.
Said, Such, Aforementioned – these are often used instead of simpler pronouns or references to previously mentioned subjects.
Shall, Hereby, Wherein – used to express obligation or to refer to specific sections of a document.
Witnesseth, Know all men by these presents – phrases often found in the opening of legal documents, although they are becoming less common.
Latin is still used for certain legal terms and principles, such as "habeas corpus" (you shall have the body), "pro bono" (for the public good), "in loco parentis" (in the place of a parent), and "prima facie" (at first sight).
Business Communication
- Direct and clear sentence structures in business correspondence to convey messages efficiently.
- Conditional structures (if-then statements) for proposals, negotiations, or discussing potential scenarios.
- Imperatives for instructions or directives, especially in memos or emails.
Technical Writing
- Sequential and procedural grammar structures for instructions and manuals (e.g., first, next, finally).
- Present simple tense for universal truths and instructions.
- Infinitives and gerunds for instructions and guidelines.
Journalistic Writing
- Inverted pyramid structure, starting with the most newsworthy information (who, what, when, where, why, and how) and then adding details.
- Active voice for a more immediate and engaging tone.
- Short sentences and paragraphs for readability and emphasis.
Medical Communication
- Specialized terminology often formed from Latin or Greek roots.
- Passive structures when describing procedures or patient conditions.
- Use of conditionals for discussing diagnoses, treatments, and patient care scenarios.
Scientific Research Papers
- Use of the present tense to discuss established knowledge or general truths.
- Past tense for describing specific methods, experiments, and results.
- Passive voice to emphasize the process or findings over the researcher's actions.
2) Purpose-Driven Communication: The use of language in LSP is closely tied to specific objectives and tasks relevant to a particular professional or academic field. The language is tailored to efficiently and accurately convey information pertinent to these objectives.
Linguistic characteristics of purpose-driven communication:
Clarity and Precision: Language is used in a clear, precise manner to avoid ambiguity. This is especially important in fields like law, science, and business, where misunderstandings can have significant consequences.
Conciseness: Communication is often concise, with a focus on delivering information efficiently. Superfluous details are avoided to maintain the focus on the core message or objective.
Formality and Professionalism: In many contexts, especially professional ones, a formal tone is used to convey seriousness and respect. This includes the use of professional jargon or technical terms relevant to the field.
Objective Tone: Emotional language is typically minimized in favor of an objective, factual style. This helps in maintaining professionalism and ensuring that the information is received and interpreted based on its merits.
Jargon and Technical Language: Specialized vocabulary or jargon pertinent to the specific field or audience is commonly used. This allows for precise, succinct communication among experts but can be confusing for outsiders.
Target Audience Consideration: The language is tailored to the knowledge level and interests of the intended audience. For a general audience, technical terms might be explained or simplified, while for a specialized audience, more complex terminology and concepts might be used directly.
Structured and Organized: Purpose-driven communication is often well-structured, with a clear beginning, middle, and end. This structure aids in guiding the audience through the information in a logical, coherent manner.
Call to Action: In many cases, such as marketing or advocacy, the language includes a clear call to action, guiding the audience towards a desired response or behaviour.
Persuasive Elements: Depending on the purpose, persuasive language might be used to influence the audience's beliefs, attitudes, or actions. This includes the use of rhetorical devices, compelling arguments, and motivational appeals.
Cultural Sensitivity: The language is often crafted with an awareness of cultural norms and expectations, especially in international or multicultural contexts.
3) Audience and Context Awareness: LSP is characterized by its focus on the needs and background of its audience, which is usually composed of professionals or academics within a specific field. This awareness influences the choice of vocabulary, level of technicality, and mode of communication.
4) Pragmatic and Functional Approach: LSP emphasizes the pragmatic and functional use of language. It focuses on enabling effective communication within a specific domain, often prioritizing clarity, precision, and efficiency over stylistic elements of language.
5) Interdisciplinary Nature: LSP often intersects with various disciplines, requiring an understanding of both linguistic principles and the specific knowledge domain it serves.
6) Cultural and Contextual Sensitivity: LSP is sensitive to the cultural and situational contexts in which the language is used, acknowledging that different fields may have unique communicative conventions and expectations.
7) Dynamic and Evolving: As fields of study and industries evolve, so does the LSP associated with them. It is a dynamic area of study, constantly adapting to new developments, terminologies, and communication needs of specific sectors.
2. Terms as a special group of lexis and their role in LSP
A "term" is a word or a phrase used in a specific context, particularly within a specialized field, to denote a very precise concept or object. The key characteristics of a term include its specialized nature, its use within a particular domain (like medicine, law, technology, etc.), and its role in conveying specific, often complex, ideas within that domain.
The etymology of the word "term" traces back to Latin and earlier roots:
Latin Origins: The word "term" comes from the Latin word "terminus," which means "boundary," "end," "limit," or "goal." The original sense of "terminus" was quite literal, often referring to a physical boundary marker, such as a stone or pillar that marked the end of a property line or road.
Old French Influence: The word entered Middle English through Old French, where it was spelled "terme." In this phase, the word began to develop more abstract meanings, extending from physical boundaries to more conceptual limits, such as periods of time or the limits of an agreement.
Shift in Meaning: Over time, in English, the word "term" evolved to include its current meanings, such as a word or phrase used in a specific context, especially in specialized fields (like legal or academic terms), or a duration of time (as in academic terms or terms of office).
The evolution from a physical boundary marker to a more abstract concept of defining or limiting something reflects a common linguistic phenomenon where concrete meanings extend to more metaphorical or abstract applications. This historical development of the word "term" mirrors its function in language, as it serves to delineate and define concepts and ideas within specific contexts.
Terms are normally opposed to common words of general language vocabulary. The words used by all speakers of a language and those used specifically within certain professional groups have distinct classifications:
General Vocabulary (or Common Vocabulary): These are words that are known and used by all speakers of a language. They form the basic, everyday language used in ordinary conversation and writing. General vocabulary includes words for common objects, actions, and ideas, and is not specialized or technical. It's accessible to people of all ages and backgrounds and is essential for basic communication in any language.
Technical Vocabulary (or Jargon): These are words or expressions used primarily by professionals within a specific field or industry. Jargon is specialized language that may not be understood by people outside that particular field. It includes specific terms, acronyms, and phrases that efficiently convey complex ideas or processes within that profession. For instance, legal jargon includes terms like "habeas corpus" or "amicus curiae," which are specific to the legal field.
Sublanguage or Lingo: This term can also be used to refer to the specialized language used by a particular group, profession, or hobbyist community. It encompasses the jargon and specific expressions that are characteristic of that group's activities and interests.
In linguistic studies, scholars sometimes differentiate between terms and jargon. They are both related to specialized language use, but they differ in their scope, context, and sometimes in their accessibility to non-specialists. Understanding these differences is important for effective communication, particularly in professional or technical contexts.
Terms (Terminology)
Specificity and Precision: Terms are specific words or phrases that have a precise meaning in a particular field or discipline. They are used to describe concepts, processes, or objects that are unique to that field.
Standardization: Terms are often standardized within a field, especially in professional, academic, or technical disciplines. This standardization is important for clarity and consistency in communication.
Accessibility: Terms can be accessible to a wider audience, particularly if they are explained or if the context provides sufficient information. Many terms eventually become part of the general vocabulary as the concepts they represent become more widely known.
Examples: "Photosynthesis" in biology, "habeas corpus" in law, or "amortization" in finance.
Jargon
Professional or Group Language: Jargon refers to the specialized language used by a specific professional, occupational, or other group. It includes terms but can also encompass slang, abbreviations, acronyms, and idiomatic expressions unique to that group.
Insider Language: Jargon is often seen as 'insider' language, understood by members of a particular group but potentially confusing or inaccessible to outsiders. It can create a sense of community or exclusivity among those who understand it.
Function: While it efficiently conveys ideas within the group, jargon can be problematic in broader communication due to its lack of clarity for the uninitiated.
Examples: Medical jargon like "stat" (immediately), tech industry jargon like "dogfooding" (using one's own product), or business jargon like "blue sky thinking" (creative, unimpeded ideas).
Another view distinguishes between terms and non-terms. Traditional studies give the following reasons for this opposition:
1) Specialization and Context-Specificity: Terms are often specific to a particular field or subject area and might not be used or understood outside of that context. Common words, on the other hand, are used in everyday language and are generally understood by most speakers of a language.
2) Precision and Unambiguity: Terms are usually defined very precisely within their field. They are intended to convey an exact meaning with little room for ambiguity, which is crucial in technical or specialized communication. Common words can have multiple meanings and can be more ambiguous.
3) Stability Over Time: Terms often have a more stable meaning within their field, whereas the meanings of common words can change more fluidly and frequently over time.
4) Formal vs. Informal Use: Terms are typically used in formal, professional, or academic contexts, whereas common words are used in both formal and informal situations.
5) Requirement for Explanation: Terms may require explanation or definition when used outside their specific field, whereas common words are generally understood without additional context.
6) Creation Process: New terms are often created deliberately through a process of standardization, especially in fast-evolving fields like technology and science. Common words, however, often evolve naturally through everyday language use.
7) Cultural and Linguistic Variations: While common words are integral to the everyday language and culture of a broad group of speakers, terms are more focused on the culture and language of a specific professional or academic community.
In summary, while common words form the basis of everyday language and communication, terms are specialized expressions used within specific fields to convey precise and complex concepts with clarity and accuracy.
However, the problem of terms is quite a complex issue with several aspects: linguistic (terms are words), epistemological (terms represent and structure knowledge), and practical (terms refer to specific objects or concepts that we employ, manipulate, or use for our practical needs). This gives terminological studies a multidisciplinary nature, and this approach has a long-standing tradition.
A good example is Antoine Lavoisier, a renowned French chemist of the 18th century, often referred to as the "Father of Modern Chemistry," who highlighted the importance of precision in scientific language. One of his most famous quotes regarding terms and language is:
"To develop the spirit of precision in chemistry, we must first make the language precise."
This quote underscores Lavoisier's belief in the crucial role of clear and precise terminology in scientific discourse. He recognized that in order to advance in scientific understanding and effectively communicate scientific ideas, the language used must be accurate and unambiguous. This perspective was particularly significant in his time, as chemistry was transitioning from a qualitative to a quantitative science.
Lavoisier's emphasis on precise language was part of his broader efforts to systematize and professionalize chemistry. He introduced a methodical approach to chemical nomenclature, aimed at providing a clear and consistent way to name chemical compounds. This was a significant departure from the alchemical traditions, which often used obscure and inconsistent terms. By standardizing chemical terminology, Lavoisier made it easier for scientists to communicate their findings and build upon each other's work, which was pivotal in the development of modern chemistry.
This approach demonstrates that terminology and its domain cannot be studied separately, as they unite the facts this domain deals with, the ideas and theories about these facts, and the vocabulary to express these ideas.
In spite of this special status, linguists apply traditional methods of vocabulary studies to terms as language structures.
From a structural point of view, terms, as well as general vocabulary units, can be analysed based on their composition, formation, and relationship within language. Here are some key structural aspects of terms:
Simple Terms
These are terms consisting of a single lexical unit (word).
Examples include "atom" in physics or "aorta" in anatomy.
Compound Terms
Formed by combining two or more words, often to describe a new concept or a more specific aspect of a field.
They can be written as one word (e.g., "keyboard"), hyphenated (e.g., "x-ray"), or as separate words (e.g., "blood pressure").
Examples include "hard drive" in computing or "power of attorney" in law.
Complex Terms (or Phrasal Terms)
These are phrases consisting of multiple words that together form a single concept.
They are more than just a sum of their parts and often have a specific meaning in a field that might not be immediately apparent from the individual words.
Examples include "habeas corpus" in law or "natural selection" in biology.
Abbreviations and Acronyms
Abbreviations are shortened forms of terms, like "DNA" for Deoxyribonucleic Acid.
Acronyms are a type of abbreviation formed from the initial letters of a phrase and pronounced as a word, like "laser" (Light Amplification by Stimulated Emission of Radiation).
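The pattern behind acronym formation can be sketched in a few lines of code. This is only an illustration of the mechanism using the expansion of "laser" quoted above; real acronyms are lexicalized conventions, and the stop-word list here is an assumption made for the example.

```python
# A toy sketch of acronym formation: join the initial letters of the
# content words of a phrase, skipping function words.

STOPWORDS = {"by", "of", "the", "and", "in"}

def acronym(phrase: str) -> str:
    """Join the initial letters of the content words of a phrase."""
    return "".join(
        word[0].lower()
        for word in phrase.split()
        if word.lower() not in STOPWORDS
    )

print(acronym("Light Amplification by Stimulated Emission of Radiation"))
# laser
```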
Neologisms
These are newly coined terms, often created to describe new inventions, concepts, or phenomena.
Neologisms can be completely new words, existing words used in a new way, or new combinations of existing words.
Borrowed Terms
Many terms are borrowed from other languages and may retain their foreign spelling and pronunciation.
For instance, many medical terms are borrowed from Latin or Greek.
Derivative Terms
These are terms formed by adding prefixes and suffixes to change the meaning of a base word.
For example, "unemployment" is derived by adding the prefix "un-" to "employment".
Syntactic Structure
In complex or phrasal terms, the syntactic structure (how the words are put together) can be important in conveying the precise meaning.
For instance, "magnetic resonance imaging" has the following structure: adjective + noun + noun; the main word in this phrase is the last noun (imaging), it nominates a test in medical diagnostics. The noun in preposition (resonance) has a function of an attribute, it is a subordinate describing the quality of a medical test? additional quality represents the adjective (magnetic). These functional characteristics constitute the meaning of the term.
Studying the meanings of terms, scholars apply both traditional linguistic methods typical of general vocabulary studies, and multidisciplinary methods, combining alternative methods and paradigms. The majority of studies state that terms are special in their meanings. Let us study the comparative semantic analysis of common words and terms given in Table 1.
Table 1. Comparative semantic analysis of common words and terms
If we look at the examples of domain descriptions, we will see these links and other elements of the conceptual systems.
Example 1
In the domain of metallurgy, ligatures are crucial components in the alloying process, significantly impacting the physical, chemical, and mechanical properties of metals and alloys. Their precise use and control are essential aspects of metallurgical engineering and materials science.
Alloying Element:
A ligature in metallurgy is often an alloying element or a combination of elements. These elements are added to a base metal to modify its properties, such as strength, ductility, corrosion resistance, or melting point.
Impurity Control:
Ligatures can be used to bind or control impurities within a metal, thereby enhancing the quality of the final product. They can either neutralize impurities or facilitate their removal during the refining process.
Microstructure Modification:
Adding ligatures can affect the microstructure of a metal, leading to improved mechanical properties. For instance, certain elements added to steel can influence its hardness or toughness.
Specialized Applications:
In some specialized metallurgical processes, ligatures may be used to impart specific characteristics required for particular applications, such as in aerospace, automotive, or electronic industries.
Form and Composition:
Ligatures can come in various forms, including powders, granules, or wires, and are added to the molten metal during the smelting or refining process.
The composition of a ligature is carefully chosen based on the desired effect on the base metal and the specific requirements of the metallurgical process.
Common examples include the addition of manganese, silicon, or chromium to steel to achieve desired properties. Each of these elements serves a specific purpose, such as improving tensile strength, resistance to wear and corrosion, or ductility.
Example 2
In the domain of music, a ligature is a notation indicating that a sequence of notes should be performed in a connected, fluid manner. It is a crucial element in articulating the phrasing and expressive qualities of a musical piece.
Notation:
In traditional music notation, a ligature often appears as a curved line (a slur) that connects the heads of the notes. This line indicates that the notes it spans should be played or sung in a connected, seamless manner.
In vocal music, a ligature can also indicate that multiple notes are to be sung to a single syllable of text.
Historical Usage:
The term "ligature" in early music had a different meaning. In medieval and Renaissance notation, it referred to a group of notes written together in a way that indicated a specific rhythmic and melodic relationship. This usage is largely historical and not commonly employed in modern music notation.
Instrumental Music:
In instrumental music, a ligature (or slur) signifies that the notes should be played without separation, often with a single bow stroke on string instruments or without tonguing between notes on wind instruments.
Vocal Music:
In singing, a ligature indicates that the notes should be sung smoothly and connectedly, blending one note into the next without breaks.
Interpretation:
The use of ligatures affects the phrasing and expressiveness of a piece. Musicians interpret these markings to create a desired emotional or stylistic effect, contributing to the overall musical expression.
Different from a Tie:
It's important to distinguish a ligature from a tie. A tie is a similar curved line that connects two notes of the same pitch, indicating that they should be held as a single sustained note. A ligature, on the other hand, connects notes of different pitches in a smooth progression.
These two examples show that dealing with a term entails understanding the whole domain, which means considering a terminological system as a set of elements that have specific definitions or represent notions and that form connections and relations within this system.
By using general vocabulary, speakers are free to employ various words to convey their personal vision and understanding of some situation, so they describe it in their own manner.
By using terms, speakers are obliged to represent a domain of systematised knowledge employing a pre-designed model (a system of notions) to describe a situation.
Thus, terms are essential for the communication of specialized knowledge. They facilitate the sharing of information and ideas within a field and between fields, playing a crucial role in knowledge dissemination and academic discourse. In LSP, terms provide precise and unambiguous ways to refer to concepts, processes, and objects that are unique to a specific domain. This precision is crucial for effective communication, particularly where misunderstanding or ambiguity can have serious consequences. The use of specialized terms contributes to a sense of professional identity and belonging within a community of practice. It can act as a marker of professional competence and expertise. For learners and newcomers to a field, understanding and using the correct terms is key to acquiring and demonstrating expertise. Terms act as building blocks for learning the language and concepts of the field.
3. Terminology in modern multidisciplinary paradigms and applied fields
Terminology and its development reflect progress in various professional fields, so it draws much attention from scholars whose aims are both theoretical and practical. Nowadays communication and science are globalised, and they have become digital and multilingual. Thus, modern approaches to studying terms, especially in the context of linguistics, translation studies, and terminology management, have evolved significantly with advancements in technology and interdisciplinary research. Here are some key trends in terminological studies:
3.1. Cognitive Linguistics
This approach focuses on how terms are understood and processed in the human mind. It examines the relationship between language, thought, and cultural context, providing insights into how terms acquire meaning and are used.
Semantic analysis of terminology establishes the idea that each term represents a notion. Notions are the results of human categorisation of the world, representing abstract ideas or mental constructs. They structure human experience, converting it into systematised knowledge. To describe a notion, the logical school of language studies and cognitive linguistics apply the terms "volume" and "content" of a notion, which refer to the scope and the specific information that a notion encompasses.
Volume of a notion refers to its breadth or scope. It encompasses how broad or narrow, general or specific, a notion is.
For example, the notion of "vehicle" has a large volume as it includes a wide range of items (cars, bikes, boats, planes, etc.). In contrast, the notion of "sedan" has a smaller volume, referring more specifically to a type of car. The volume of a notion determines how many objects or ideas can be classified under it.
Content of a notion refers to the specific attributes, characteristics, or information that define a particular notion. It includes the defining features or the essential qualities that make up the notion. For instance, the content of the notion "bird" might include attributes like feathers, beak, laying eggs, and the ability to fly (though not all birds fly, it's often a perceived characteristic). The content of a notion is what distinguishes it from other notions and helps in identifying and categorizing specific items or ideas within that concept.
The content and the volume of a notion are interdependent.
Content Determines Volume: The content of a notion, with its defining features and characteristics, essentially determines its volume. For example, the notion of "animal" has a broad volume because its content includes the fundamental characteristics of animals (living, breathing, moving organisms) but is not overly specific. This allows for a wide range of entities (from insects to whales) to be categorized under this notion.
Volume Influences Perception of Content: Conversely, the perceived volume of a notion influences our understanding of its content. A broader volume might lead to a more generalized or abstracted perception of content, whereas a narrower volume might result in a more detailed or specific understanding.
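This interdependence can be illustrated with a toy model: if a notion's content is a set of defining features, its volume is the set of entities possessing all those features, so richer content automatically yields a narrower volume. The entities and attributes below are invented for the example.

```python
# A toy illustration of how the content of a notion (its defining features)
# determines its volume (the set of things that fall under it).

entities = {
    "sparrow": {"animal", "has_feathers", "lays_eggs", "can_fly"},
    "penguin": {"animal", "has_feathers", "lays_eggs"},
    "bat":     {"animal", "can_fly"},
    "whale":   {"animal"},
}

def volume(content: set) -> set:
    """Return every entity whose attributes include all features in `content`."""
    return {name for name, attrs in entities.items() if content <= attrs}

# Richer content -> narrower volume.
print(volume({"animal"}))                             # all four entities
print(volume({"animal", "has_feathers"}))             # {'sparrow', 'penguin'}
print(volume({"animal", "has_feathers", "can_fly"}))  # {'sparrow'}
```

Note that the model also captures the caveat about birds above: "can_fly" is a perceived rather than universal feature, so adding it to the content excludes the penguin.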
The applications of the cognitive approach to terminology lie in its combination with IT and AI. There are two main directions: ontologies and semantic networks. These are instruments developed in artificial intelligence, computer science, and information science to represent knowledge in structured forms. They are used for organizing and structuring knowledge about terms and for showing the relationships between different terms and concepts, and they play a crucial role in enabling machines to process, understand, and respond to complex information.
An ontology in the context of computer science is a structured framework for organizing information and represents formal knowledge as a set of concepts within a domain, and the relationships between those concepts. An ontology is an explicit specification of a conceptualization. It provides a shared vocabulary for a domain and defines the meaning of terms and the relationships between them.
Components:
Classes (or Concepts): Categories or types of objects or ideas within a domain.
Attributes: Features or properties that the objects can have.
Relations: The ways in which objects and classes can be related to one another.
Individuals: Instances or actual objects in the domain that the ontology describes.
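The four components can be sketched as a minimal in-memory model. This is only an illustration of the structure, not a real ontology toolkit; all names (Vehicle, Sedan, my_car) are invented for the example, echoing the "vehicle"/"sedan" notions discussed earlier.

```python
# A minimal in-memory sketch of the four ontology components:
# classes, attributes, relations (here: subclass-of and instance-of),
# and individuals.

from dataclasses import dataclass, field

@dataclass
class OntologyClass:
    name: str
    parent: "OntologyClass | None" = None       # subclass/superclass relation
    attributes: set = field(default_factory=set)

@dataclass
class Individual:
    name: str
    cls: OntologyClass                          # instance-of relation
    values: dict = field(default_factory=dict)  # attribute -> value

vehicle = OntologyClass("Vehicle", attributes={"max_speed"})
sedan = OntologyClass("Sedan", parent=vehicle, attributes={"number_of_doors"})
my_car = Individual("my_car", sedan, {"max_speed": 180, "number_of_doors": 4})

def is_a(cls: OntologyClass, ancestor: OntologyClass) -> bool:
    """Walk the subclass chain to answer 'is cls a kind of ancestor?'."""
    while cls is not None:
        if cls is ancestor:
            return True
        cls = cls.parent
    return False

print(is_a(my_car.cls, vehicle))  # True: a Sedan is a Vehicle
```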
In this way, the conceptual base or frame of an ontology is a taxonomy. A taxonomy is a classification system that provides a structured and systematic way to categorize and organize diverse entities based on shared characteristics or attributes, and it plays a crucial role in various fields, including biology, information management, and commerce.
Taxonomy is a system or method of classifying and organizing things, typically living organisms, into hierarchical categories or groups based on their shared characteristics or attributes. The primary purpose of taxonomy is to provide a systematic and structured way to understand and categorize the diversity of life on Earth, making it easier for scientists and researchers to study and communicate about different species and their relationships.
The Linnaean taxonomy, developed by Carl Linnaeus in the 18th century, is one of the most well-known and widely used systems of taxonomy. It categorizes living organisms into a hierarchical structure consisting of several levels or ranks, from broad to specific:
Kingdom
Phylum
Class
Order
Family
Genus
Species
Each level represents a progressively more specific grouping of organisms. For example, all species within a particular genus share more characteristics in common than species in different genera within the same family. This hierarchical classification helps scientists organize and categorize the vast diversity of life.
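The ranked hierarchy above can be modelled as an ordered mapping. The sample data is the standard Linnaean classification of humans (Homo sapiens); the `lineage` helper simply reads off the path from the broadest rank down to a chosen one.

```python
# The seven Linnaean ranks as an ordered list, and one classification
# (for Homo sapiens) as a rank -> taxon mapping.

RANKS = ["Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"]

homo_sapiens = {
    "Kingdom": "Animalia",
    "Phylum": "Chordata",
    "Class": "Mammalia",
    "Order": "Primates",
    "Family": "Hominidae",
    "Genus": "Homo",
    "Species": "Homo sapiens",
}

def lineage(classification: dict, down_to: str) -> list:
    """Return the classification path from Kingdom down to the given rank."""
    cutoff = RANKS.index(down_to) + 1
    return [classification[rank] for rank in RANKS[:cutoff]]

print(lineage(homo_sapiens, "Family"))
# ['Animalia', 'Chordata', 'Mammalia', 'Primates', 'Hominidae']
```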
Modern taxonomy also incorporates molecular and genetic data to complement traditional morphological characteristics, which has led to revisions and refinements in the classification of certain organisms.
Taxonomy is not limited to the classification of living organisms. It can be applied to other fields as well, such as:
Document and Information Management: Taxonomy is used to classify and organize documents, files, and information in a structured manner, making it easier to search, retrieve, and manage data within organizations.
Web Content and Information Architecture: In web development and content management, taxonomy is used to create hierarchical structures for organizing website content, improving user navigation and search functionality.
Library Science: Taxonomy is essential in library cataloguing systems to classify books, publications, and other library materials, ensuring efficient organization and retrieval of information.
Botanical Gardens and Zoos: Taxonomy is applied to organize and label plants and animals on display, helping visitors understand the relationships and characteristics of different species.
Business and Product Classification: In business and commerce, taxonomy is used to categorize products, services, and inventory, facilitating inventory management, e-commerce, and supply chain operations.
Creating an ontology involves a systematic process of defining concepts, relationships, and properties within a specific domain of knowledge. Here are the general steps to create an ontology, covering both a conceptual part and an IT part:
Conceptual part
1) Define the Scope and Purpose
Clearly define the scope and purpose of your ontology. What is the specific domain of knowledge you want to model, and what are the goals you want to achieve with your ontology?
2) Identify Concepts
Identify and list the key concepts or entities within your chosen domain. These concepts represent the building blocks of your ontology.
3) Define Relationships
Determine how the concepts are related to each other. Consider the different types of relationships that exist between concepts, such as "is a," "part of," "has property," "related to," etc.
4) Specify Properties
Define the properties or attributes associated with each concept. Properties describe the characteristics, features, or attributes of the concepts. These properties may include data types (e.g., string, integer) and constraints.
5) Create a Taxonomy or Hierarchy
Organize the concepts into a hierarchical structure or taxonomy. This hierarchy should reflect the relationships between concepts and their levels of specificity. Typically, this structure follows a tree-like or parent-child relationship.
The IT part depends on the choice of a formal representation language to express your ontology. Common languages for ontology development include OWL (Web Ontology Language), RDF (Resource Description Framework), and RDFS (RDF Schema).
6) Create Ontology Classes
Create classes for each concept in your ontology. Classes serve as the formal representation of concepts and their properties. Specify the class hierarchy using subclass and superclass relationships.
7) Define Properties and Restrictions
Define properties and attribute restrictions for each class. This includes specifying which properties are applicable to each class and defining any domain and range restrictions on properties.
8) Add Instances
Create instances or individuals of the ontology classes. Instances represent specific real-world objects or entities within your domain. Link instances to their respective classes and specify their property values.
9) Establish Relationships
Define relationships between instances using the defined relationship properties. Connect instances to other instances based on the relationships they have in the real world.
10) Test and Validate
Test your ontology to ensure that it accurately represents the knowledge within the domain. Check for consistency, completeness, and logical correctness. Validation tools and reasoners can help with this step.
11) Document
Document your ontology thoroughly. Provide descriptions, definitions, and examples for concepts, relationships, and properties. Clear documentation is essential for users and future maintainers of the ontology.
12) Publish and Share
If your ontology is intended for broader use, consider publishing and sharing it with the relevant community or stakeholders. Make it available in a format that others can access and use.
13) Maintain and Evolve
Ontologies are not static; they may need to evolve over time as knowledge in the domain changes. Regularly update and maintain your ontology to ensure its relevance and accuracy.
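The steps above can be sketched as a toy in-memory model in plain Python. This is not a real ontology toolkit (a production system would use OWL with a library such as rdflib or Owlready2); the class, instance, and property names are invented, but the sketch shows a class hierarchy (steps 5-6), instances (step 8), relationship triples (step 9), and a minimal reasoning check (step 10).

```python
# Toy ontology: a class hierarchy, typed instances, and relationship
# triples. All names are invented for illustration.
subclass_of = {"Dog": "Mammal", "Mammal": "Animal", "Animal": None}

instances = {"rex": "Dog"}                 # individual -> its class
triples = [("rex", "has_owner", "alice")]  # (subject, property, object)

def is_a(individual, cls):
    """True if the individual's class equals cls or inherits from it."""
    current = instances[individual]
    while current is not None:
        if current == cls:
            return True
        current = subclass_of[current]
    return False

print(is_a("rex", "Animal"))   # True: Dog -> Mammal -> Animal
```

Even this toy version illustrates why the class hierarchy matters: a query about "Animal" correctly matches an individual declared only as a "Dog", because the subclass relation is followed transitively.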
Ontologies have various key applications across different domains, and they are used by a wide range of professionals and industries. Some of the key applications of ontologies include:
Knowledge Representation: Ontologies are primarily used for representing knowledge in a structured and standardized format. They provide a formal and explicit way to define concepts, their relationships, and properties, making it easier to share and understand knowledge.
Semantic Web: Ontologies play a crucial role in the Semantic Web, enabling machines to understand and interpret the meaning of web content. They facilitate data integration and interoperability by providing a common vocabulary for describing data and information on the web.
Information Retrieval: Ontologies can improve information retrieval systems by allowing for more precise and context-aware search queries. They help in matching user queries with relevant documents and data by considering semantic relationships.
Natural Language Processing (NLP): Ontologies are used in NLP applications for disambiguation, entity recognition, and sentiment analysis. They help machines understand the meaning of words and phrases in context.
Healthcare: Ontologies are applied in healthcare to standardize medical terminologies, define relationships between medical concepts, and support decision support systems, clinical data integration, and medical knowledge management.
Bioinformatics: Ontologies are used to represent biological and genomic data, aiding in the integration of various biological databases and enabling researchers to make sense of complex biological relationships.
Robotics and AI: Ontologies are employed in robotics and artificial intelligence to represent domain knowledge, making it easier for robots and AI systems to understand and interact with the real world.
E-commerce: Ontologies help in product classification, recommendation systems, and product search by providing a structured way to describe products, their features, and relationships.
Information Governance: Enterprises use ontologies to manage and categorize their data, ensuring data consistency, quality, and compliance with industry standards and regulations.
Geography and Geospatial Applications: Ontologies are used to represent geospatial information, enabling systems to understand and process location-based data, such as maps, GPS data, and geographic information systems (GIS).
Education and E-Learning: Ontologies support the development of intelligent tutoring systems and personalized learning platforms by modelling educational content, learning objectives, and learner profiles.
Industry-specific Applications: Various industries, such as finance, manufacturing, aerospace, and energy, use ontologies to model domain-specific knowledge and facilitate data integration and decision-making.
Who Uses Ontologies:
Researchers and scientists in various fields, including computer science, biology, and medicine, use ontologies to formalize and share their knowledge.
Software developers and engineers use ontologies to build semantic applications and systems that can understand and process data more intelligently.
Knowledge engineers and experts design and maintain ontologies to capture and represent domain-specific knowledge.
Businesses and organizations in various industries leverage ontologies to improve data management, decision support, and knowledge sharing.
Government agencies use ontologies for data integration, policy modelling, and information retrieval.
Semantic web developers and architects work on projects related to the Semantic Web, where ontologies are fundamental to achieving the web's vision of machine-readable data and interconnected information.
In summary, ontologies are versatile tools with applications in a wide range of domains and are used by professionals, researchers, and organizations to represent, share, and leverage knowledge effectively.
Semantic Networks
Definition: A semantic network is a graphical representation of knowledge that depicts relationships between concepts. It is a form of knowledge representation that visualizes concepts (or nodes) and the connections (or edges) between them.
Characteristics: Semantic networks are often used for associative representations, where the links between nodes represent the relationship between the ideas. These networks can be simple, with only one kind of relationship, or complex, with multiple relationship types.
Uses: They are used in natural language processing, cognitive science, and knowledge representation. They help in understanding and modelling how human beings process and structure knowledge.
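A semantic network can be sketched as a labelled graph of (node, relation, node) edges. The toy example below (plain Python, with invented facts in the classic "a canary is a bird, birds can fly" style) shows how properties are inherited by following "is_a" links:

```python
# A semantic network as a list of labelled edges (node, relation, node).
edges = [
    ("canary", "is_a", "bird"),
    ("bird", "is_a", "animal"),
    ("bird", "can", "fly"),
    ("canary", "has_property", "yellow"),
]

def inherited(node, relation):
    """Collect values of `relation` found at the node and at every
    ancestor reachable through `is_a` links."""
    found, frontier = set(), [node]
    while frontier:
        current = frontier.pop()
        for s, r, o in edges:
            if s != current:
                continue
            if r == relation:
                found.add(o)
            elif r == "is_a":
                frontier.append(o)
    return found

print(inherited("canary", "can"))   # {'fly'} — inherited from 'bird'
```

Note that the network mixes several relation types ("is_a", "can", "has_property") in one graph, which is exactly the flexibility that distinguishes semantic networks from the more rigidly typed structure of ontologies.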
Open access products based on semantic networks provide access to knowledge and data in various domains. Here are some examples:
Wikidata: Wikidata is an open knowledge base that uses a semantic network to store structured data about a wide range of topics, including people, places, and concepts. It serves as a central hub for linked data and is used by Wikipedia and other projects.
DBpedia: DBpedia is a project that extracts structured information from Wikipedia and represents it as a semantic network. It provides structured data about people, places, and things described in Wikipedia articles.
Linked Open Data Cloud: The Linked Open Data (LOD) Cloud is a collection of linked datasets from various sources that are interconnected through semantic relationships. These datasets cover diverse domains, such as culture, science, government, and more.
Freebase (now part of Wikidata): Freebase was a community-driven knowledge graph that aimed to organize information about the world's people, places, and things. Its developer, Metaweb, was acquired by Google, and Freebase's data was later migrated to Wikidata.
YAGO: YAGO (Yet Another Great Ontology) is a knowledge base and ontology that contains information about millions of entities and their relationships, making it suitable for tasks like entity linking and knowledge retrieval.
WordNet: WordNet is a lexical database of the English language that is structured as a semantic network of words and their relationships, including synonyms, hypernyms (is-a relationships), and hyponyms.
SNOMED CT: SNOMED CT (Systematized Nomenclature of Medicine - Clinical Terms) is a comprehensive clinical terminology and ontology that represents relationships between medical concepts, making it valuable for healthcare informatics.
Linked Data from Government Sources: Various government agencies and organizations worldwide publish linked open data, including information about government services, statistical data, and geographic information.
BabelNet: BabelNet is a multilingual semantic network that connects words, phrases, and concepts in multiple languages, providing a valuable resource for natural language processing and machine translation.
Cyc: Cyc is a knowledge base and ontology that aims to capture common-sense knowledge about the world. It includes information about everyday concepts and their relationships.
Differences and Similarities
Common Ground: Both ontologies and semantic networks are used to represent knowledge, but they do so in slightly different ways. Ontologies are more rigid in structure and are concerned with formalizing the types of objects and their interrelations in a domain, while semantic networks are often more flexible and visually oriented.
Purpose: Ontologies are typically used for larger, more complex systems that require a detailed and formal representation of knowledge in a domain, such as in the Semantic Web. Semantic networks are more suited for tasks like cognitive modelling and understanding language processing.
Representation: Ontologies often require a more formal language for definition, like OWL (Web Ontology Language), while semantic networks can be represented using simpler graphical forms.
In summary, ontologies and semantic networks are powerful tools in organizing and representing knowledge. They enable machines to process complex information and are fundamental in fields like AI, natural language processing, and knowledge management.
The application of the cognitive approach reveals the interdisciplinary nature of this paradigm. The study of terms often involves a blend of linguistics, information science, cognitive psychology, and domain-specific knowledge, which ensures a comprehensive understanding of terminology in context.
Another trend is crowdsourcing and the use of collaborative platforms, such as Wiktionary or specialized forums, which allow for the collective creation and refinement of terminological databases, benefiting from the knowledge of a vast and diverse user base. The use of social media and online communities is in the same vein: the study of language and terms now often includes analysis of informal and evolving usage patterns as seen in social media, forums, and online communities.
3.2. Corpus and Computational Linguistics
This area involves analysing large bodies of text (corpora) to study the frequency, usage patterns, and contexts of terms. It helps in understanding how language is used in real-life settings, which is essential for accurate term translation and usage.
Corpus linguistics works with samples of real-world text and has made significant contributions to terminological studies and the development of terminology management instruments. Here is an overview of these contributions:
Term Extraction and Identification:
Corpus linguistics enables the automatic extraction of potential terms from large volumes of text. This is particularly useful for identifying specialized vocabulary in specific fields or domains.
Understanding Contextual Usage:
By analysing how terms are used in context, corpus linguistics helps in understanding the nuanced meanings of terms, their connotations, and the circumstances of their use.
Terminology Standardization:
Corpora can provide evidence for preferred or more frequent usage of terms within a community, aiding in the process of standardizing terminology across a field.
Terminology Database Development:
Insights gained from corpus analysis are used to develop and enrich terminological databases, ensuring that they are up-to-date and reflect current usage.
Multilingual Terminology Work:
Multilingual corpora allow for the comparison of terms across languages, aiding in the process of translation and the creation of bilingual or multilingual terminological resources.
Development of Glossaries and Dictionaries:
Corpus linguistics provides empirical data for the creation of specialized glossaries and dictionaries, which are essential tools in terminology management.
Language Variation and Change:
Analysing corpora over time helps in tracking changes in language and terminology, which is crucial for keeping terminological resources relevant and accurate.
Semantic Analysis:
The study of corpora helps in understanding the relationships between terms and concepts (semantic fields), which is important for organizing and structuring terminological knowledge.
Quality Control in Translation:
Corpora are used to ensure the consistency and accuracy of terminology in translation work, which is a key aspect of quality control in this field.
Training and Education:
Corpus-based studies are used in training translators, interpreters, and terminology managers, providing them with insights into practical, real-world language use.
Customized Corpus Development:
Specific corpora can be developed for particular fields or projects, providing tailored resources for terminological analysis and management.
Supporting Natural Language Processing (NLP) Applications:
Corpus linguistics contributes to the development of NLP applications, including automated term recognition and extraction tools, which are increasingly important in terminology management. More generally, computational linguistics provides a wide range of tools for terminology management; for example, we can compare and contrast collocations represented in national corpora and find corresponding terminological expressions (see Pic. 1).
Picture 1. Collocations of ‘chemical’ and ‘химический’ from Russian and American corpora.
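The first contribution listed above, automatic term extraction, can be sketched with a simple frequency comparison: words that are much more frequent in a specialist text than in a general reference text are good term candidates. The example below uses only the standard library and two invented toy "corpora"; a real system would work with large corpora and statistical keyness measures such as log-likelihood.

```python
from collections import Counter

# Toy corpora; real term extraction would use much larger samples.
domain_text = "the reagent reacts with the catalyst and the reagent is consumed"
general_text = "the cat sat on the mat and the dog sat near the door"

def relative_freq(text):
    """Map each word to its relative frequency in the text."""
    words = text.split()
    counts = Counter(words)
    return {w: c / len(words) for w, c in counts.items()}

domain, general = relative_freq(domain_text), relative_freq(general_text)

# "Weirdness" ratio: domain frequency over general frequency,
# smoothed so words unseen in the general corpus do not divide by zero.
candidates = sorted(
    domain,
    key=lambda w: domain[w] / general.get(w, 1e-4),
    reverse=True,
)
print(candidates[:3])   # domain-specific words such as 'reagent' rank first
```

Common function words like "the" appear in both corpora with similar frequency, so their ratio stays near 1 and they sink to the bottom of the ranking, while domain vocabulary rises to the top.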
The data collected through the corpus-based approach are widely used in another area of DH: Natural Language Processing. Natural Language Processing (NLP) is a branch of artificial intelligence that focuses on enabling computers to understand, interpret, and respond to human language in a valuable way. NLP combines computational linguistics (rule-based modelling of human language) with statistical, machine learning, and deep learning models. These technologies enable computers to process human language in the form of text or voice data and to 'understand' its full meaning, complete with the speaker's or writer's intent and sentiment.
NLP has a wide range of applications, including text translation, sentiment analysis, speech recognition, chatbots, search engines, text summarization, and much more.
Let us take sentiment analysis as an example to illustrate how NLP works.
Sentiment Analysis, also known as opinion mining, is a field within Natural Language Processing (NLP) that focuses on identifying and categorizing opinions expressed in text, especially to determine whether the writer's attitude towards a particular topic, product, etc., is positive, negative, or neutral. This technique is widely used for understanding customer sentiments in reviews, social media posts, and other textual content. Here's a more detailed overview:
How Sentiment Analysis Works:
Text Processing: Involves cleaning and preparing text data for analysis, including tasks like tokenization (breaking text into words or phrases), removing stop words (common words that don't contribute to the meaning), and stemming or lemmatization (reducing words to their base form).
Feature Extraction: Transforming processed text into a format that machine learning algorithms can understand, often using techniques like bag-of-words or word embeddings.
Sentiment Classification: Applying machine learning or deep learning algorithms to classify the sentiment of the text. This can be a binary classification (positive or negative), ternary (positive, negative, neutral), or even on a scale (very positive, somewhat positive, neutral, somewhat negative, very negative).
Context and Tone Understanding: Advanced sentiment analysis involves understanding context and tone, which can be challenging as it requires the algorithm to recognize things like sarcasm, irony, or subtlety.
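The first three stages above can be sketched with a minimal lexicon-based classifier. The tiny stop-word list and sentiment lexicon below are invented for illustration; real systems use trained models or established resources such as VADER, and they also attempt the context and tone understanding that this sketch ignores.

```python
import re

# Invented toy resources; real systems use far larger lexicons.
STOP_WORDS = {"the", "a", "an", "is", "was", "this", "of"}
LEXICON = {"great": 1, "excellent": 2, "poor": -1, "terrible": -2}

def tokenize(text):
    """Lowercase and split into word tokens (a crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())

def classify(text):
    """Sum lexicon scores of non-stop-word tokens and map to a label."""
    tokens = [t for t in tokenize(text) if t not in STOP_WORDS]
    score = sum(LEXICON.get(t, 0) for t in tokens)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(classify("The service was excellent and the food was great"))  # positive
print(classify("This was a terrible experience"))                    # negative
```

Because the classifier only sums word scores, it would label a sarcastic sentence like "Oh, great, another delay" as positive — a concrete illustration of why context and tone understanding is the hard part of the pipeline.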
Applications:
Business and Marketing: Analysing customer feedback, reviews, and social media posts to gauge public opinion about products or services.
Politics: Monitoring public opinion on political issues, campaigns, or politicians.
Finance: Sentiment analysis of news articles, reports, or social media to predict stock market trends.
Healthcare: Analysing patient feedback, responses, and reviews about treatments or healthcare services.
Customer Service: Automating responses and prioritizing customer queries based on sentiment.
Tools and Technologies:
Python Libraries: NLTK, TextBlob, spaCy, and scikit-learn offer tools for sentiment analysis.
APIs and Platforms: Google Cloud Natural Language, IBM Watson, and Amazon Comprehend provide sentiment analysis as part of their NLP services.
Deep Learning Frameworks: TensorFlow and PyTorch for building custom sentiment analysis models using neural networks.
Challenges:
Sarcasm and Irony: Detecting sarcasm and irony in text is a significant challenge as it often requires understanding subtle cues and context.
Domain-Specific Language: Sentiment indicators can vary greatly across different domains, making it necessary to tailor models to specific areas.
Multilingual Analysis: Analysing sentiment in languages other than English, especially those with less NLP resource support, can be complex.
There are several open-source or open-access NLP tools available, which are widely used in academia and industry for various NLP tasks:
Natural Language Toolkit (NLTK):
A popular Python library providing easy-to-use interfaces to over 50 corpora and lexical resources, along with libraries for text processing for classification, tokenization, stemming, tagging, parsing, and semantic reasoning.
SpaCy:
An industrial-strength, Python-based NLP library that emphasizes efficiency and accuracy. SpaCy is designed specifically for production use and offers many pre-built models for various languages.
Stanford NLP:
A suite of NLP tools provided by Stanford University. It includes software for part-of-speech tagging, named entity recognition (NER), neural dependency parsing, and much more, and is often used in academic research.
Apache OpenNLP:
A machine learning-based toolkit for processing natural language text, supporting common NLP tasks such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution.
Gensim:
A Python library for topic modelling and document similarity analysis. Particularly known for its implementation of the Word2Vec model.
BERT and Transformers (by Hugging Face):
BERT (Bidirectional Encoder Representations from Transformers) is a transformer-based model known for its effectiveness in a wide range of NLP tasks. Hugging Face provides a library of pre-trained transformers including BERT and others, which are highly influential in modern NLP.
Tesseract OCR:
An optical character recognition (OCR) engine, useful for reading text from images and converting it into editable text formats.
FastText:
Developed by Facebook, FastText is an open-source, free, lightweight library that allows users to learn text representations and text classifiers, with support for several languages.
CoreNLP:
Developed by Stanford University, it provides a set of natural language analysis tools which can identify the base forms of words, their parts of speech, whether they are names of companies, people, etc., normalize dates, times, and numeric quantities, and mark up the structure of sentences in terms of phrases and syntactic dependencies.
AllenNLP:
An open-source NLP research library, built on PyTorch, designed for high-quality and efficient research in deep learning-based NLP.
In summary, corpus linguistics significantly enhances terminological studies and management by providing empirical data about language use, aiding in term identification and standardization, enriching terminological resources, and supporting the development of tools and applications for effective terminology management.