
Grigori SIDOROV

PhD, Professor and researcher,
Natural Language and Text Processing Laboratory,
Center for Computer Research (CIC),
National Polytechnic Institute (IPN),
Mexico City, Mexico.

Regular member of the Mexican Academy of Sciences,
Member of the National System of Researchers of Mexico (SNI), level 3 (highest).

Phone: 52-55-57296000 ext. 56518, 56544
Mobile: 55-91887293

 


 

Curriculum Vitae

 

Areas of interest

Text processing techniques and systems, automatic dictionary processing, automatic morphological analysis of different languages, automatic syntactic analysis, anaphora resolution, word sense disambiguation, corpus linguistics, parallel texts, linguistic software development.

 

Current projects:

  • Linguistic tools;
  • Parallel texts;
  • Automatic analysis of explanatory dictionaries.

 

Downloads:

LICENSE:

1.      You can use all these programs freely for academic purposes. No warranty.

2.      You should inform us about the usage of the programs, and

3.      You should cite the corresponding papers in your publications obtained with the help of these programs.

 

Downloading means that you accept the license. Thank you.

 

English-Spanish dictionary of weighted morphological forms. Forms are weighted according to the distributions of corresponding grammar classes in corpora.

For example:

'cause porque 1.0000000

'til hasta 1.0000000

a un 0.4603677

a una 0.3662918

a unas 0.0734382

a uno 0.0031157

a unos 0.0967866

abaci ábaco 0.0561639

abaci ábacos 0.9438361

abacus ábaco 0.9890721

abacus ábacos 0.0109279

abacuses ábaco 0.0561639

abacuses ábacos 0.9438361

abandon abandonábamos 0.0024804

abandon abandonáis 0.0005694

abandon abandonáramos 0.0004860

abandon abandonáremos 0.0007113

abandon abandonásemos 0.0004860...

...abandon abandonaba 0.0779384

abandon abandonabais 0.0000805

abandon abandonaban 0.0226584...

In Unicode.
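A minimal sketch of how the dictionary might be read, assuming each line holds three whitespace-separated fields (English form, Spanish form, weight) as in the sample above. The function names `load_weighted_forms` and `best_translation` are hypothetical and are not part of the distributed resource; the sample lines are copied from the excerpt.

```python
from collections import defaultdict

# Sample lines copied from the excerpt above; the real file is much larger.
SAMPLE = """\
'cause porque 1.0000000
a un 0.4603677
a una 0.3662918
a unas 0.0734382
a uno 0.0031157
a unos 0.0967866
abaci ábaco 0.0561639
abaci ábacos 0.9438361
abacus ábaco 0.9890721
abacus ábacos 0.0109279
"""


def load_weighted_forms(lines):
    """Group (Spanish form, weight) pairs under each English form.

    Assumption: three whitespace-separated fields per line.
    """
    table = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        english, spanish, weight = parts
        table[english].append((spanish, float(weight)))
    return table


def best_translation(table, english):
    """Return the highest-weighted Spanish form, or None if unknown."""
    candidates = table.get(english)
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[1])[0]


table = load_weighted_forms(SAMPLE.splitlines())
print(best_translation(table, "a"))      # un
print(best_translation(table, "abaci"))  # ábacos
# In the sample, the weights of each English form add up to 1.
print(round(sum(w for _, w in table["a"]), 6))  # 1.0
```

When reading the actual file, open it with the encoding it was saved in (the page only says "In Unicode", so the exact encoding is left to the reader to check).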

Paper to cite for the English-Spanish dictionary of weighted morphological forms:

Grigori Sidorov, Alberto Barrón-Cedeño and Paolo Rosso. English-Spanish Large Statistical Dictionary of Inflectional Forms. In: Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA), 2010, pp. 277-281.

 

 

Interface for the system for fast search of Maya glyphs based on their visual structural description: compressed as an EXE file or compressed as a ZIP file.

Beta-version. The system uses the dictionary of J. Montgomery.

EXE: Download the Glyphs.exe file and execute it; the files will be copied to the folder you choose. Then execute the file SETUP.EXE.

ZIP: Download the Glyphs.zip file and unzip the files to the folder you choose. Then execute the file SETUP.EXE.

 

Papers to cite for the glyph search system:

1.        Obdulia Pichardo Lagunas, Grigori Sidorov. Diccionario de los glifos maya con descripción visual estructural. In: Proc. of International Conference EURALEX-2008, Barcelona, Spain, July 2008, pp 747-751.

2.        Grigori Sidorov, Obdulia Pichardo-Lagunas, and Liliana Chanona-Hernandez. Search Interface to a Mayan Glyph Database based on Visual Characteristics. Lecture Notes in Computer Science, Vol. 5723, Springer-Verlag, 2009, pp. 222-229

 

System for automatic morphological analysis of Spanish. NEW: a complete wordlist (beta version) generated with this system is available.

 

System for automatic morphological analysis of Russian

 

These are EXE files for Windows; DLLs are available on request.

These programs perform lemmatization and provide grammatical information for each word form of Spanish or Russian, respectively.

See detailed description on the corresponding pages – follow the links.

 

Paper to cite for the morphological analysis systems:

A. Gelbukh, G. Sidorov. Approach to construction of automatic morphological analysis systems for inflective languages with little effort. In: Computational Linguistics and Intelligent Text Processing (CICLing-2003), Lecture Notes in Computer Science, N 2588, Springer-Verlag, 2003, pp. 215–220.

 

 

 

Download concordances for Russian (EXE for Windows). This program builds concordances for Russian texts. Its distinctive feature is that it can build a concordance for a set of grammatical categories, e.g., all nouns in the dative singular.
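The idea of a concordance restricted to grammatical categories can be sketched as follows. This is only an illustration and not the program's actual implementation: the `Token` type, the tag names, and the hand-tagged sentence are assumptions, and in practice the tags would come from a morphological analyzer.

```python
from typing import NamedTuple


class Token(NamedTuple):
    form: str        # word form as it appears in the text
    lemma: str       # dictionary form
    tags: frozenset  # grammatical categories (hypothetical tag names)


def concordance(tokens, required_tags, window=2):
    """Return (left context, keyword, right context) for every token
    whose tags include all of required_tags."""
    required = frozenset(required_tags)
    lines = []
    for i, tok in enumerate(tokens):
        if required <= tok.tags:
            left = " ".join(t.form for t in tokens[max(0, i - window):i])
            right = " ".join(t.form for t in tokens[i + 1:i + 1 + window])
            lines.append((left, tok.form, right))
    return lines


# Hand-tagged toy sentence: "I gave a book to a friend and a letter to my sister."
sentence = [
    Token("Я", "я", frozenset({"pronoun", "nominative", "singular"})),
    Token("дал", "дать", frozenset({"verb", "past", "singular"})),
    Token("книгу", "книга", frozenset({"noun", "accusative", "singular"})),
    Token("другу", "друг", frozenset({"noun", "dative", "singular"})),
    Token("и", "и", frozenset({"conjunction"})),
    Token("письмо", "письмо", frozenset({"noun", "accusative", "singular"})),
    Token("сестре", "сестра", frozenset({"noun", "dative", "singular"})),
]

# All nouns in the dative singular, with two words of context on each side.
for left, keyword, right in concordance(sentence, {"noun", "dative", "singular"}):
    print(f"{left:>15} [{keyword}] {right}")
```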

 

Paper to cite:

G. O. Sidorov. Lemmatization in automatized system for compilation of personal style dictionaries of literature writers. – Chapter in: “Word of Dostoyevsky”, Moscow, Russia, Russian Academy of Sciences, 1996. pp. 266-300.

 

Download parser with Spanish grammar (EXE and DLL for Windows). This is a chart parser that uses a context-free (CF) grammar with elements of unification. An experimental CF grammar for Spanish is provided, along with tools for modifying it.
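For readers unfamiliar with chart parsing, here is a minimal sketch of a chart-based recognizer. It uses the CYK algorithm over a toy grammar in Chomsky normal form, which is a simpler technique chosen for illustration and not necessarily the algorithm used in the downloadable parser. The rules and lexicon are invented for this example, and unification (e.g., gender and number agreement) is omitted.

```python
from itertools import product

# Toy grammar in Chomsky normal form (hypothetical, for illustration only).
# Each key is a pair of right-hand-side categories; the value is the set
# of categories that can be built from that pair.
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}

LEXICON = {
    "el": {"Det"},
    "la": {"Det"},
    "niño": {"N"},
    "manzana": {"N"},
    "come": {"V"},
}


def recognize(words, start="S"):
    """Return True if the word sequence can be derived from `start`."""
    n = len(words)
    if n == 0:
        return False
    # chart[i][j] holds the categories that span words[i:j].
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        chart[i][i + 1] = set(LEXICON.get(word, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point between the two parts
                for left, right in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= BINARY_RULES.get((left, right), set())
    return start in chart[0][n]


print(recognize("el niño come la manzana".split()))  # True
print(recognize("niño el come".split()))             # False
```

The chart stores every category found for every span, so partial analyses are computed once and reused; this reuse is the property that chart parsers share regardless of the specific algorithm.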

 

Paper to cite:

A. Gelbukh, G. Sidorov, S. Galicia Haro, I. Bolshakov. Environment for Development of a Natural Language Syntactic Analyzer. In: Acta Academia 2002, Moldova, 2002, pp. 206-213.

 

Publications:

More than 140 scientific publications, 1 patent.

More than 150 citations of my work (excluding self-citations).

 

Distinctions:

  • Who’s Who in the World.
  • Who’s Who in Science and Engineering.
  • Editor-in-Chief of the research journal “Polibits”.

Qualifications:

  • “Lomonosov” Moscow State University, 1996
    Candidate of Philological sciences (Ph.D.) (Structural, applied and mathematical linguistics)
    Thesis: “Design and implementation of linguistic models, algorithms, and data for the systems with morphological analysis and generation for Russian language”;
  • “Lomonosov” Moscow State University, 1983-1988
    (M.C. & B.C.) Philological Faculty, Department of Structural and Applied Linguistics;

Selected publications:

1.      A. Gelbukh, G. Sidorov, A. Guzmán-Arenas. Use of a weighted topic hierarchy for text retrieval and classification. In Václav Matoušek et al. (Eds.). Text, Speech and Dialogue. Proc. 2nd International Workshop TSD-99, Plzen, Czech Republic, September 13-17, 1999. Lecture Notes in Artificial Intelligence, No. 1692, Springer, pp. 130–135.

2.      A. Gelbukh, G. Sidorov, and A. Guzmán-Arenas. A Method of Describing Document Contents through Topic Selection. Proc. SPIRE’99, International Symposium on String Processing and Information Retrieval, Cancun, Mexico, September 22–24. IEEE Computer Society Press, 1999, pp. 73-80.

3.      Alexander F. Gelbukh and Grigori Sidorov. On Indirect Anaphora Resolution. Proc. PACLING-99, Pacific Association for Computational Linguistics, ISBN 0-9685753-0-7, University of Waterloo, Waterloo, Ontario, Canada, August 25-28, 1999, pp. 181-190

4.      Grigori Sidorov, Alexander Gelbukh. Demonstrative pronouns as markers of indirect anaphora. Proc. 2nd International Conference on Cognitive Science and 16th Annual Meeting of the Japanese Cognitive Science Society Joint Conference (ICCS/JCSS99), July 27-30, 1999, Tokyo, Japan, pp. 418-423

5.      Alexander Gelbukh and Grigori Sidorov. Approach to construction of automatic morphological analysis systems for inflective languages with little effort. In: Computational Linguistics and Intelligent Text Processing. Proc. CICLing-2003, 4th International Conference on Intelligent Text Processing and Computational Linguistics, February 15–22, 2003, Mexico City. Lecture Notes in Computer Science (indexed by SCIE), N 2588, Springer-Verlag, pp. 215–220.

6.      Alexander Gelbukh, Grigori Sidorov, and Liliana Chanona-Hernández. Compilation of a Spanish representative corpus. Proc. CICLing-2002, Conference on Intelligent Text Processing and Computational Linguistics, February 16–23, 2002, Mexico City. Lecture Notes in Computer Science N 2276, Springer-Verlag, pp. 285–288.

7.      Alexander Gelbukh and Grigori Sidorov. Automatic Selection of Defining Vocabulary in an Explanatory Dictionary. Proc. CICLing-2002, Conference on Intelligent Text Processing and Computational Linguistics, February 16–23, 2002, Mexico City. Lecture Notes in Computer Science N 2276, Springer-Verlag, pp. 300–303.

8.      Alexander Gelbukh, Grigori Sidorov, SangYong Han, and Erika Hernández-Rubio. Automatic Enrichment of Very Large Dictionary of Word Combinations on the Basis of Dependency Formalism. Lecture Notes in Artificial Intelligence N 2972, 2004, ISSN 0302-9743, Springer-Verlag, pp. 430-437 (discussion of the collocation concept).

9.      Alexander Gelbukh and Grigori Sidorov. Alignment of Paragraphs in Bilingual Texts using Bilingual Dictionaries and Dynamic Programming. Lecture Notes in Computer Science, N 4225, ISSN 0302-9743, Springer-Verlag, 2006, pp. 824-833 (methods of alignment of parallel texts).

10.  Gaspár Ramírez, James L. Fidelholtz, Héctor Jiménez, Grigori Sidorov. Elaboración de un diccionario de verbos del español a partir de una lexicografía sistemática. In: “Avances en la Ciencia de la computación”, Proc. of 7th Int. Conf. ENC-2006, San Luis Potosí, Mexico, 2006, pp. 270-275.

11.  Alexander Gelbukh, Grigori Sidorov, SangYong Han. On Some Optimization Heuristics for Lesk-Like WSD Algorithms. Lecture Notes in Computer Science, N 3513, ISSN 0302-9743, Springer-Verlag, 2005, pp. 402–405.

12.  Alexander Gelbukh and Grigori Sidorov. Zipf and Heaps Laws’ Coefficients Depend on Language. Lecture Notes in Computer Science N 2004, 2001, ISSN 0302-9743, Springer-Verlag, pp. 330–333.

13.  Castro-Sánchez, N. A., Sidorov, G. Automatic Acquisition of Synonyms of Verbs from an Explanatory Dictionary using Hyponym and Hyperonym Relations. Lecture Notes in Computer Science, vol. 6718, 2011.

14.  María de los Ángeles Alonso-Lavernia, Argelio Víctor De-la-Cruz-Rivera, and Grigori Sidorov. Generation of Natural Language Explanations of Rules in an Expert System. Lecture Notes in Computer Science N 3878, Springer-Verlag, ISSN 0302-9743, 2006, pp. 311-314.

You can find more information about the papers, our laboratory, and the annual International Conference on Computational Linguistics, CICLing (held in Mexico City, proceedings published by Springer-Verlag), on the page of Alexander Gelbukh: www.gelbukh.com.