Grigori SIDOROV
PhD,
Professor and researcher, Regular member of
Phone: 52-55-57296000 ext. 56518, 56544
e-mail:
Research interests: text processing techniques and systems, automatic dictionary processing, automatic morphological analysis of different languages, automatic syntactic analysis, anaphora resolution, word sense disambiguation, corpus linguistics, parallel texts, linguistic software development.
LICENSE:
1. You can use all these programs freely for academic purposes. No warranty.
2. You should inform us about the usage of the programs, and
3. You should cite the corresponding papers in your publications obtained with the help of these programs.
Downloading means that you accept the license. Thank you.
English-Spanish dictionary of weighted morphological forms. Forms are weighted
according to the distributions of corresponding grammar classes in corpora.
For example:
'cause porque 1.0000000
'til hasta 1.0000000
a un 0.4603677
a una 0.3662918
a unas 0.0734382
a uno 0.0031157
a unos 0.0967866
abaci ábaco 0.0561639
abaci ábacos 0.9438361
abacus ábaco 0.9890721
abacus ábacos 0.0109279
abacuses ábaco 0.0561639
abacuses ábacos 0.9438361
abandon abandonábamos 0.0024804
abandon abandonáis 0.0005694
abandon abandonáramos 0.0004860
abandon abandonáremos 0.0007113
abandon abandonásemos 0.0004860...
...abandon abandonaba 0.0779384
abandon abandonabais 0.0000805
abandon abandonaban 0.0226584...
The dictionary is encoded in Unicode.
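The sample entries above suggest one entry per line: an English form, a Spanish form, and a weight, separated by whitespace. In the sample, the weights listed for one English form (for example "a") sum to 1. The following is a minimal Python sketch for reading such entries, assuming that layout holds for the whole file; the actual delimiter and file layout are not stated on this page, and the names parse_entries and best_translation are hypothetical, not part of the distributed resource.

```python
from collections import defaultdict

# A few lines copied from the sample above.
SAMPLE = """\
'cause porque 1.0000000
'til hasta 1.0000000
a un 0.4603677
a una 0.3662918
a unas 0.0734382
a uno 0.0031157
a unos 0.0967866
abaci ábaco 0.0561639
abaci ábacos 0.9438361
"""


def parse_entries(text):
    """Group (Spanish form, weight) pairs by English form.

    Assumption: each line is "<english> <spanish> <weight>" with
    whitespace separators and the weight as the last field.
    """
    entries = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        pair, weight = line.rsplit(None, 1)     # weight is the last field
        english, spanish = pair.split(None, 1)  # English form is the first field
        entries[english].append((spanish, float(weight)))
    return dict(entries)


def best_translation(entries, english):
    """Return the (Spanish form, weight) pair with the highest weight."""
    return max(entries[english], key=lambda item: item[1])


entries = parse_entries(SAMPLE)
print(best_translation(entries, "a"))
print(sum(weight for _, weight in entries["a"]))
```

Reading a real file would additionally require opening it with the correct Unicode encoding, which this page does not specify.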
Paper to cite for the English-Spanish dictionary of weighted morphological forms:
Grigori Sidorov, Alberto Barrón-Cedeño and Paolo Rosso. English-Spanish Large Statistical Dictionary of Inflectional Forms. In: Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA), 2010, pp. 277-281.
Interface for the system for fast search of Maya glyphs based on their visual structural description: compressed as an EXE file or compressed as a ZIP file.
Beta version. The system uses the dictionary of J. Montgomery.
EXE: Download the Glyphs.exe file and execute it; the files will be copied to the folder you choose. Then execute the file SETUP.EXE.
ZIP: Download the Glyphs.zip file and unzip the files to the folder you choose. Then execute the file SETUP.EXE.
Papers to cite for the glyph search system:
1. Obdulia Pichardo Lagunas, Grigori Sidorov. Diccionario de los glifos maya con descripción visual estructural. In: Proc. of International Conference EURALEX-2008, Barcelona, Spain, July 2008, pp. 747-751.
2. Grigori Sidorov, Obdulia Pichardo-Lagunas, and Liliana Chanona-Hernandez. Search Interface to a Mayan Glyph Database based on Visual Characteristics. Lecture Notes in Computer Science, Vol. 5723, Springer-Verlag, 2009, pp. 222-229.
System for automatic morphological analysis of Spanish. NEW: A complete wordlist (beta version) generated with this system is available.
System for automatic morphological analysis of Russian.
These are EXE files for Windows; DLLs are available on request.
These programs perform lemmatization and provide grammatical information for each word form of Spanish or Russian, respectively.
See the detailed descriptions on the corresponding pages (follow the links).
Paper to cite for the morphological analysis systems:
A. Gelbukh, G. Sidorov. Approach to construction of automatic morphological analysis systems for inflective languages with little effort. In: Computational Linguistics and Intelligent Text Processing (CICLing-2003), Lecture Notes in Computer Science, N 2588, Springer-Verlag, 2003, pp. 215–220.
Download concordances for Russian (EXE for Windows). This program builds concordances for Russian. Its interesting feature is that it can build concordances for a set of grammatical categories, e.g., all nouns in the dative singular.
Paper to cite:
G. O. Sidorov. Lemmatization in automatized system for compilation of personal style dictionaries of literature writers. Chapter in: “Word of Dostoyevsky”,
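The idea of building a concordance for a set of grammatical categories, as described for the Russian concordance program above, can be illustrated with a small Python sketch. It assumes the text has already been morphologically tagged; the Token type, the tag values ("NOUN", "dat", "sg"), and the concordance function are hypothetical and do not reflect the program's actual implementation or tag set.

```python
from typing import NamedTuple


class Token(NamedTuple):
    form: str
    pos: str     # hypothetical tags, e.g. "NOUN", "VERB", "PRON"
    case: str    # "nom", "dat", "acc", or "" when not applicable
    number: str  # "sg", "pl", or ""


# Hand-tagged toy sentence: "Я дал книгу брату" ("I gave the book to my brother").
TOKENS = [
    Token("Я", "PRON", "nom", "sg"),
    Token("дал", "VERB", "", "sg"),
    Token("книгу", "NOUN", "acc", "sg"),
    Token("брату", "NOUN", "dat", "sg"),
]


def concordance(tokens, width=2, **features):
    """Return (left context, keyword, right context) for every token
    whose attributes match all the requested grammatical features."""
    lines = []
    for i, tok in enumerate(tokens):
        if all(getattr(tok, name) == value for name, value in features.items()):
            left = " ".join(t.form for t in tokens[max(0, i - width):i])
            right = " ".join(t.form for t in tokens[i + 1:i + 1 + width])
            lines.append((left, tok.form, right))
    return lines


# All nouns in the dative singular, with up to two words of context per side.
hits = concordance(TOKENS, pos="NOUN", case="dat", number="sg")
for left, keyword, right in hits:
    print(f"{left} [{keyword}] {right}")
```

Selecting by grammatical features rather than by word form is what requires lemmatization and morphological analysis to run before the concordance is built.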
Download parser with Spanish grammar (EXE and DLL for Windows). This is a chart parser that uses a context-free (CF) grammar with elements of unification. An experimental CF grammar for Spanish is provided, along with tools for modifying it.
Paper to cite:
A. Gelbukh, G. Sidorov, S. Galicia Haro,
More than 140 scientific publications, 1 patent.
More than 150 citations of my work (excluding self-citations).
− Who’s Who in the World.
− Who’s Who in Science and Engineering.
− Editor-in-Chief of the research journal “Polibits”.
1. A. Gelbukh, G. Sidorov, A. Guzman-Arenas. Use of a weighted topic hierarchy for text retrieval and classification. In Václav Matoušek et al. (Eds.). Text, Speech and Dialogue. Proc. 2nd International Workshop TSD-99, Plzen, Czech Republic, September 13-17, 1999. Lecture Notes in Artificial Intelligence, No. 1692, Springer, pp. 130–135.
2. A. Gelbukh, G. Sidorov, and A. Guzmán-Arenas. A Method of Describing Document Contents through Topic Selection. Proc. SPIRE’99, International Symposium on String Processing and Information Retrieval, Cancun, Mexico, September 22–24. IEEE Computer Society Press, 1999, pp. 73-80.
3. Alexander F. Gelbukh and Grigori Sidorov. On Indirect Anaphora Resolution. Proc. PACLING-99, Pacific Association for Computational Linguistics, ISBN 0-9685753-0-7, University of Waterloo, Waterloo, Ontario, Canada, August 25-28, 1999, pp. 181-190.
4. Grigori Sidorov, Alexander Gelbukh. Demonstrative pronouns as markers of indirect anaphora. Proc. 2nd International Conference on Cognitive Science and 16th Annual Meeting of the Japanese Cognitive Science Society Joint Conference (ICCS/JCSS99), July 27-30, 1999, Tokyo, Japan, pp. 418-423.
5. Alexander Gelbukh and Grigori Sidorov. Approach to construction of automatic morphological analysis systems for inflective languages with little effort. In: Computational Linguistics and Intelligent Text Processing. Proc. CICLing-2003, 4th International Conference on Intelligent Text Processing and Computational Linguistics, February 15–22, 2003, Mexico City. Lecture Notes in Computer Science (indexed by SCIE), N 2588, Springer-Verlag, pp. 215–220.
6. Alexander Gelbukh, Grigori Sidorov, and Liliana Chanona-Hernández. Compilation of a Spanish representative corpus. Proc. CICLing-2002, Conference on Intelligent Text Processing and Computational Linguistics, February 16–23, 2002, Mexico City. Lecture Notes in Computer Science N 2276, Springer-Verlag, pp. 285–288.
7. Alexander Gelbukh and Grigori Sidorov. Automatic Selection of Defining Vocabulary in an Explanatory Dictionary. Proc. CICLing-2002, Conference on Intelligent Text Processing and Computational Linguistics, February 16–23, 2002, Mexico City. Lecture Notes in Computer Science N 2276, Springer-Verlag, pp. 300–303.
8. Alexander Gelbukh, Grigori Sidorov, San-Yong Han, and Erika Hernández-Rubio. Automatic Enrichment of Very Large Dictionary of Word Combinations on the Basis of Dependency Formalism. Lecture Notes in Artificial Intelligence N 2972, 2004, ISSN 0302-9743, Springer-Verlag, pp. 430-437. (discussion of collocation concept)
9. Alexander Gelbukh and Grigori Sidorov. Alignment of Paragraphs in Bilingual Texts using Bilingual Dictionaries and Dynamic Programming. Lecture Notes in Computer Science, N 4225, ISSN 0302-9743, Springer-Verlag, 2006, pp. 824-833. (methods of alignment of parallel texts)
10. Gaspár Ramírez, James L. Fidelholtz, Héctor Jiménez, Grigori Sidorov. Elaboración de un diccionario de verbos del español a partir de una lexicografía sistemática. In: “Avances en
11. Alexander Gelbukh, Grigori Sidorov, SangYong Han. On Some Optimization Heuristics for Lesk-Like WSD Algorithms. Lecture Notes in Computer Science, N 3513, ISSN 0302-9743, Springer-Verlag, 2005, pp. 402–405.
12. Alexander Gelbukh and Grigori Sidorov. Zipf and Heaps Laws’ Coefficients Depend on Language. Lecture Notes in Computer Science N 2004, 2001, ISSN 0302-9743, Springer-Verlag, pp. 330–333.
13. Castro-Sánchez, N. A., Sidorov, G. Automatic Acquisition of Synonyms of Verbs from an Explanatory Dictionary using Hyponym and Hyperonym Relations. Lecture Notes in Computer Science, vol. 6718, 2011.
14. María de los Ángeles Alonso-Lavernia, Argelio Víctor De-la-Cruz-Rivera, and Grigori Sidorov. Generation of Natural Language Explanations of Rules in an Expert System. Lecture Notes in Computer Science N 3878, Springer-Verlag, ISSN 0302-9743, 2006, pp. 311-314.