dBase version of the WordNet 1.6 lexicon database

The database was created using Visual dBase 5.7.

It is a fairly large database and requires more disk space than the original WordNet, as the following figures show.

.dbf files

.zip file 8.8 MB
Total size, when unpacked, 65 MB

Index files

You do not necessarily need the index files, for example if you convert the .dbf files to some other database system.

.zip file 5.5 MB
Total size 35 MB



Database schema

The .dbf files shown in the diagram below and their index (.mdx) files are included. Other files, such as the exception lists and the lexicographer file names, are currently missing from the .zip files.

   WORD_IDX
      |
      --> WORD_SNS
            |
             --> SYNSETS        <--
                   |               |
                    --> WORDS      |
                   |               |
                    --> POINTERS --
                   |
                    --> FRAMES

The database structure is described in the table listings below. For more information about the database fields, see the WordNet documentation (Format of WordNet database files (1.7.1)).
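
The arrows in the diagram suggest a lookup path from a word to its senses and on to the other words of each synset. The sketch below illustrates that path with Python's sqlite3 module on a reduced copy of the schema. Joining WORD_IDX to WORD_SNS on IDX_RECNO, and the remaining tables on SYNSET_ID plus POS, is an assumption read off the field names in the structure listings, not something the source states. All rows, synset identifiers and glosses are invented sample data, and senses is a hypothetical helper.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE word_idx (idx_recno INTEGER, word TEXT, pos TEXT);
CREATE TABLE word_sns (idx_recno INTEGER, synset_id TEXT, pos TEXT, sense_nr INTEGER);
CREATE TABLE synsets  (synset_id TEXT, pos TEXT, gloss TEXT);
CREATE TABLE words    (synset_id TEXT, pos TEXT, word TEXT);
""")

# Invented sample data; real identifiers and glosses will differ.
con.execute("INSERT INTO word_idx VALUES (1, 'car', 'n')")
con.executemany("INSERT INTO word_sns VALUES (?, ?, ?, ?)",
                [(1, "S1", "n", 1), (1, "S2", "n", 2)])
con.executemany("INSERT INTO synsets VALUES (?, ?, ?)",
                [("S1", "n", "a motor vehicle"),
                 ("S2", "n", "a wheeled rail vehicle")])
con.executemany("INSERT INTO words VALUES (?, ?, ?)",
                [("S1", "n", "car"), ("S1", "n", "auto"),
                 ("S2", "n", "car"), ("S2", "n", "railcar")])


def senses(con, word):
    """Return (sense_nr, gloss, synset word) rows for one index word."""
    return con.execute("""
        SELECT sns.sense_nr, syn.gloss, w.word
        FROM word_idx AS idx
        JOIN word_sns AS sns ON sns.idx_recno = idx.idx_recno
        JOIN synsets  AS syn ON syn.synset_id = sns.synset_id
                            AND syn.pos = sns.pos
        JOIN words    AS w   ON w.synset_id = syn.synset_id
                            AND w.pos = syn.pos
        WHERE idx.word = ?
        ORDER BY sns.sense_nr, w.word
    """, (word,)).fetchall()


for row in senses(con, "car"):
    print(row)
```

POINTERS and FRAMES would hang off SYNSETS in the same way, with POINTERS leading back to SYNSETS through TARGET_ID and T_POS.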


Applications

Natural Language Parser, Machine Translator
http://www.teemapoint.com/nlpdemo/servlet/ParserServlet


Notes:

In Word_idx, the p_cnt and ptr_symbol fields are missing. However, they can presumably be reached through the Word_sns-Synsets link (not verified).

In Synsets, the ss_type field is untested.
gloss: not every gloss fits into this field, which is 254 characters long. Using a memo field for this purpose took over 100 MB of disk space.
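
Because the gloss field is limited to 254 characters, it can be useful to flag the glosses that may be incomplete. The following is only a heuristic sketch: it assumes that a gloss which fills the field up to its last character was cut off, and may_be_truncated is a hypothetical helper name.

```python
GLOSS_WIDTH = 254  # length of the GLOSS field in SYNSETS.DBF


def may_be_truncated(raw_gloss, width=GLOSS_WIDTH):
    """Return True if the gloss occupies the whole field.

    raw_gloss is the field value as stored, including trailing padding.
    A gloss cut exactly at a space is not detected by this test.
    """
    return len(raw_gloss.rstrip()) >= width


print(may_be_truncated("a short gloss".ljust(GLOSS_WIDTH)))
print(may_be_truncated("x" * GLOSS_WIDTH))
```

Glosses flagged this way could then be looked up in the original WordNet data files.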

Words: the words are case-sensitive, as they appear in the synset lists.
head_word is not in use.
Untested fields: marker, lex_id.

In Pointers, the prefix s_ means source and t_ means target.

Pointers and Frames:
word_nr, if not 0, is the order number of a word within the synset. In the original WordNet format, 0 means that the pointer or frame applies to the whole synset.
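
The word_nr convention can be sketched as follows. This assumes the convention of the original WordNet data files (0 refers to the whole synset, n refers to the n-th word, counting from 1) and also assumes that the words of a synset are available in their original order, which the Words table does not record explicitly. referenced_words is a hypothetical helper.

```python
def referenced_words(synset_words, word_nr):
    """Return the words of a synset that a pointer or frame refers to.

    Assumption: word_nr 0 means the whole synset, and a nonzero
    word_nr is a 1-based position within synset_words.
    """
    if word_nr == 0:
        return list(synset_words)
    return [synset_words[word_nr - 1]]


print(referenced_words(["car", "auto", "automobile"], 0))
print(referenced_words(["car", "auto", "automobile"], 2))
```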


Database structure

Structure for table   C:\WN16VDB\WORD_IDX.DBF
Table type            DBASE
Number of records     129509
Last update           21.04.1999
------------------------------------------------------------------
Field  Field Name                 Type          Length  Dec  Index
    1  IDX_RECNO                  NUMERIC            6           N
    2  WORD                       CHARACTER         65           Y
    3  POS                        CHARACTER          1           N
    4  POLY_CNT                   NUMERIC            2           N
    5  SENSE_CNT                  NUMERIC            2           N
    6  TAGSNS_CNT                 NUMERIC            2           N
------------------------------------------------------------------
** Total **                                         79

Structure for table   C:\WN16VDB\WORD_SNS.DBF
Table type            DBASE
Number of records     173941
Last update           21.04.1999
------------------------------------------------------------------
Field  Field Name                 Type          Length  Dec  Index
    1  IDX_RECNO                  NUMERIC            6           Y
    2  SYNSET_ID                  CHARACTER          9           N
    3  POS                        CHARACTER          1           N
    4  SENSE_NR                   NUMERIC            2           N
------------------------------------------------------------------
** Total **                                         19

Structure for table   C:\WN16VDB\SYNSETS.DBF
Table type            DBASE
Number of records     99642
Last update           21.04.1999
------------------------------------------------------------------
Field  Field Name                 Type          Length  Dec  Index
    1  SYNSET_ID                  CHARACTER          9           Y
    2  POS                        CHARACTER          1           N
    3  SS_TYPE                    CHARACTER          1           N
    4  LEX_FILE                   NUMERIC            2           N
    5  W_CNT                      NUMERIC            2           N
    6  P_CNT                      NUMERIC            3           N
    7  F_CNT                      NUMERIC            2           N
    8  GLOSS                      CHARACTER        254           N
------------------------------------------------------------------
** Total **                                        275

Structure for table   C:\WN16VDB\WORDS.DBF
Table type            DBASE
Number of records     174008
Last update           21.04.1999
------------------------------------------------------------------
Field  Field Name                 Type          Length  Dec  Index
    1  SYNSET_ID                  CHARACTER          9           Y
    2  POS                        CHARACTER          1           N
    3  WORD                       CHARACTER         65           Y
    4  MARKER                     CHARACTER          2           N
    5  HEAD_WORD                  CHARACTER         19           N
    6  LEX_ID                     NUMERIC            2           N
------------------------------------------------------------------
** Total **                                         99

Structure for table   C:\WN16VDB\POINTERS.DBF
Table type            DBASE
Number of records     238452
Last update           21.04.1999
------------------------------------------------------------------
Field  Field Name                 Type          Length  Dec  Index
    1  PTR                        CHARACTER          2           N
    2  SYNSET_ID                  CHARACTER          9           Y
    3  POS                        CHARACTER          1           N
    4  S_WORD_NR                  NUMERIC            2           N
    5  TARGET_ID                  CHARACTER          9           N
    6  T_POS                      CHARACTER          1           N
    7  T_WORD_NR                  NUMERIC            2           N
------------------------------------------------------------------
** Total **                                         27

Structure for table   C:\WN16VDB\FRAMES.DBF
Table type            DBASE
Number of records     19543
Last update           21.04.1999
------------------------------------------------------------------
Field  Field Name                 Type          Length  Dec  Index
    1  SYNSET_ID                  CHARACTER          9           Y
    2  POS                        CHARACTER          1           N
    3  FRAME_NR                   NUMERIC            2           N
    4  WORD_NR                    NUMERIC            2           N
------------------------------------------------------------------
** Total **                                         15
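
Each "** Total **" figure in the listings above is one byte larger than the sum of the field lengths. In the .dbf format this extra byte is the deletion flag stored in front of every record. The sketch below recomputes the record lengths and estimates the size of the record data from the record counts. The numbers are copied from the listings; record_length and data_bytes are hypothetical helper names, and file headers are ignored.

```python
# table name: (number of records, field lengths) from the structure listings
TABLES = {
    "WORD_IDX": (129509, [6, 65, 1, 2, 2, 2]),
    "WORD_SNS": (173941, [6, 9, 1, 2]),
    "SYNSETS":  (99642,  [9, 1, 1, 2, 2, 3, 2, 254]),
    "WORDS":    (174008, [9, 1, 65, 2, 19, 2]),
    "POINTERS": (238452, [2, 9, 1, 2, 9, 1, 2]),
    "FRAMES":   (19543,  [9, 1, 2, 2]),
}


def record_length(field_lengths):
    """Field lengths plus the one-byte deletion flag."""
    return 1 + sum(field_lengths)


def data_bytes(tables=TABLES):
    """Total size of the record data of all tables, headers excluded."""
    return sum(count * record_length(lengths)
               for count, lengths in tables.values())


for name, (count, lengths) in TABLES.items():
    size = count * record_length(lengths)
    print(f"{name:<9} record length {record_length(lengths):>3}, data {size:>10} bytes")
print(f"all tables: {data_bytes()} bytes")
```

The record data alone comes to roughly 64.9 million bytes, which is in line with the 65 MB unpacked size quoted for the .dbf files.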