Definition of Extraction

The word extraction comes from the medieval Latin extractio. The term refers to the act and result of extracting: removing, extirpating or eliminating something.

For example: "The dentist told me that, two hours before the tooth extraction, I must take an antibiotic to avoid infections", "Extracting clams is prohibited because they are an endangered species", "Environmentalists claim that gold extraction will destroy the mountain and cause irreversible damage to the ecosystem".

Different types of extraction exist in many areas. When a person goes to an automated teller machine (ATM), they can make a withdrawal, that is, extract money from their bank account and take the bills that the machine dispenses.

Blood extraction, or blood collection, is a procedure carried out in the field of nursing. Once blood has been drawn from a patient, the sample can be analyzed to obtain valuable information about the individual's health.

In dentistry, extraction is a surgical procedure that consists of removing a tooth or part of it. The dentist uses specific instruments and applies their knowledge and skills to achieve this.

In computing, information extraction is an operation carried out to retrieve content from a database. The process can be performed automatically if the information is structured.

The extraction of structured or semi-structured information is one of the tasks of information retrieval, and it is carried out on documents that the computer can read. For example, this process takes place when handwritten documents are scanned so that their data can be interpreted and loaded into a digital database; that is, an application must recognize the text and convert it into information that can be stored and edited, instead of simply leaving it as an image.

The form of the texts varies depending on the project and on the intentions of those who carry out the information extraction. In some cases they are structured forms, usually created by the same company that later extracts the information from them once third parties have filled them in; in others they are unstructured texts, such as newspaper articles or works of fiction.
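As an illustration of the structured case, here is a minimal Python sketch that reads a completed form and turns it into a record that could be stored in a database. The form layout (one "Label: value" pair per line), the field names and the function name `extract_fields` are assumptions made for this example; real forms would need rules adapted to their own layout.

```python
import re

# Hypothetical completed form; the field names are illustrative only.
FORM_TEXT = """\
Name: Marisa Lopez
Date of birth: 1990-04-12
City: Rosario
"""

# One "Label: value" pair per line; surrounding spaces and tabs are ignored.
FIELD_PATTERN = re.compile(
    r"^[ \t]*([^:\n]+?)[ \t]*:[ \t]*(.*?)[ \t]*$", re.MULTILINE
)


def extract_fields(text):
    """Return a dict mapping each field label to the value written next to it."""
    return {label: value for label, value in FIELD_PATTERN.findall(text)}


record = extract_fields(FORM_TEXT)
print(record)
```

Because the layout is known in advance, a single pattern is enough here. Unstructured texts offer no such guarantee, which is why they call for language-aware techniques.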

Here the concept of natural language comes into play. It refers to a linguistic variety that is characteristic of human beings, that arises for the purpose of communication, that is based on a specific syntax, and that follows the principles of optimality and economy of language. Text sources used for information extraction must contain messages written in a language of this type.

Among the most common tasks of information extraction are the following:

* named entity recognition: whether it is the name of a person, a company or a place, or even monetary values or other expressions belonging to predefined categories, information extraction serves to find and classify them;

* coreference resolution: this is the detection of coreference between the entities of a given document, such as that between the full name of a company and its acronym;

* semantic role labeling: in this case, the process consists of analyzing a text to identify the semantic arguments that are linked to the verbs and to classify them according to their roles. For example, in the sentence "Marisa bought a PDA from Valeria", "Marisa" is recognized as the buying agent, the "PDA" is the object, "bought" is the verb and "Valeria" is the selling agent.
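The tasks listed above can be roughly approximated with hand-written rules. The following Python sketch is a toy illustration only: the patterns, the function names (`find_money`, `is_acronym_of`, `label_purchase`) and the sample company name are assumptions made for this example, and real systems generally rely on trained language models rather than fixed patterns like these.

```python
import re

# Monetary values: a currency symbol followed by digits, e.g. "$120.50".
MONEY_PATTERN = re.compile(r"[$€£]\s?\d+(?:[.,]\d+)*")

# Purchase sentences shaped like "<Buyer> bought a <object> from <Seller>".
PURCHASE_PATTERN = re.compile(
    r"(?P<buyer>[A-Z]\w+) (?P<verb>bought) (?:a|an|the) "
    r"(?P<object>\w+) from (?P<seller>[A-Z]\w+)"
)


def find_money(text):
    """Return every monetary expression found in the text."""
    return MONEY_PATTERN.findall(text)


def is_acronym_of(acronym, full_name):
    """Check whether the acronym matches the capitalized initials of a name."""
    initials = "".join(w[0] for w in full_name.split() if w[0].isupper())
    return acronym.upper() == initials


def label_purchase(sentence):
    """Return the role of each participant in a purchase sentence, or None."""
    match = PURCHASE_PATTERN.search(sentence)
    return match.groupdict() if match else None


amounts = find_money("The PDA cost $120.50, plus €15 for shipping.")
same_entity = is_acronym_of("IBM", "International Business Machines")
roles = label_purchase("Marisa bought a PDA from Valeria")
print(amounts, same_entity, roles)
```

Rules like these break as soon as the wording changes (for instance, "Valeria sold Marisa a PDA"), which shows why unstructured text is harder to process than structured forms.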

Finally, in mining, extraction is the activity of obtaining minerals from a deposit in order to exploit them commercially: copper extraction, lithium extraction, etc.