StructSVM-SDM: Software for Sequential Dual Method for Structural SVMs

This page contains information about the software, StructSVM-SDM, which contains the implementation of a Sequential Dual Method for Structural SVMs.

A Sequential Dual Method for Structural SVMs

Many real-world applications involve prediction of structured objects like trees, graphs or sequences. Structural SVMs have become a popular tool for learning and efficiently predicting such structured objects.

Given a sample of structured inputs and outputs {(xi,yi)}i=1,2,..n (X x Y), Structural SVMs solve an optimization problem (OP1):

minw 12 ||w||2 + C Σi=1,2,…,n ξi

s.t. wT Δfi(y) Li(y) - ξi, i, yY.

Δfi(y)=f(xi,yi) - f(xi,y) is the difference feature vector and Li(y)=L(yi,y) is a suitable loss function.

The Sequential Dual Method solves a scaled dual problem of OP1 given as (OP2):

minα C2 || Σi,y αi(y) Δfi(y) ||2 - Σi,y αi(y) Li(y)

s.t. Σi,y αi(y) = 1 i, αi(y)0 i, yY.

The method works by sequentially visiting each example i and by solving a sub-problem of OP2 associated with that example.

Solving the sub-problem is usually difficult as it involves an exponential number of αi(y) variables.

The sequential dual method exploits the sparsity of the αi(y) variables and considers only a subset Vi = { y : αi(y) > 0} and solves the sub-problem with respect to Vi.

On typical data sets, the sequential dual method is an order of magnitude faster than state-of-the-art methods like the cutting-plane method.

Structural SVMs are used in various applications like sequence labeling, parse-tree classification, image segmentation etc.,

seqlabeling parsetree


Download

You can download the current version of StructSVM-SDM from the link structsvm_sdm.tar.gz

This software is available free only for non-commercial purposes. Please cite the paper mentioned below if you intend to use this program.

How to Cite

Please cite the following paper :

A Sequential Dual Method for Structural SVMs. P.Balamurugan, Shirish Shevade, S.Sundararajan, S.S.Keerthi. In SIAM International Conference on Data Mining (SDM), 2011.
Software available at http://drona.csa.iisc.ernet.in/~shirish/structsvm_sdm.html

The bibtex for the paper is :

@article{ShevadePSK11,
title = {A Sequential Dual Method for Structural SVMs},
author = {P. Balamurugan and Shirish Shevade and S. Sundararajan and S. Sathiya Keerthi},
journal = {Proceedings of the Eleventh SIAM International Conference on Data Mining},
year = {2011},
pages = {223-234},
note = {Software available at \url{http://www.drona.csa.iisc.ernet.in/~shirish/structsvm_sdm.html}}
}

Acknowledgment

The implementation uses the SVMStruct API used in Sequence Tagging and available at http://www.cs.cornell.edu/People/tj/svm_light/svm_hmm.html.

From SVMStruct implementation, the following files have been borrowed with minor modifications :

1. svm_common.h

2. svm_common.c

3. svm_learn.h

4. svm_struct_api.c

5. svm_struct_api.h

6. svm_struct_api_types.h

How To Use

Please follow the given steps to use the software. The software works currently in Linux.
For Windows machine, use Cygwin or MinGW software and follow Steps 1 to 8 to compile and run the code.
A version of the code for Mac OS will be made available soon.

1. Untar the archive structsvm_sdm.tar.gz using the command

tar xvf structsvm_sdm.tar.gz

under some path /USER_PATH(this depends on your machine).

2. Under the path /USER_PATH, a directory named structsvm_sdm is created. The full path of this directory is /USER_PATH/structsvm_sdm/.

3. Open a shell window.

4. Change directory using the following command on the shell prompt

cd /USER_PATH/structsvm_sdm/

5. Type the following command on the prompt

make

6. Three executables structsvm_sdm_learn, structsvm_sdm_classify and conll2svmlightconverter are created under the directory /USR_PATH/structsvm_sdm/.

7. Type the following in the shell prompt to export PATH variable:

export PATH=$PATH:/USER_PATH/structsvm_sdm/

8. Running the SDM program:

For learning/training phase, type the following in the shell prompt

structsvm_sdm_learn [options] train_file model_file test_file

Note:

a) options are given below. If you do not provide any options, the default parameters will be used.

b) train_file, model_file and test_file are to be specified with the full path

c) model_file is an optional second argument and if a model file is not specified, the model gets written to a default file named structsvm_sdm_struct_model in the structsvm_sdm directory.

d) test_file is an optional third argument, and if a test file is not specified, no testing is done. Please note that test file cannot be given as the second argument.

To know about learn/train options, please type

structsvm_sdm_learn -?

The options available are given below :

General Options:

-? → help
-v [0..3] → verbosity level (default 1)


Learning Options:

-c float → C: trade-off between training error and margin (default 0.1)
-l [1]


→ Loss function to use. (default 1)
1: zero/one loss per token (Hamming Loss)
(user could incorporate zero/one loss per sequence)


SDM Optimization Options:

-w {0,1}


→ choice of SDM learning algorithm (default 1):
0: baseline SDM algorithm without heuristics
1: SDM Algorithm with heuristics
-e float → max Psi tol : tolerance for SDM algorithm termination (default 0.25)
-k [1..] → kappa : No of first GetMaxY iterations (default 10)
-n [1..] → Non-GetMaxY cap : Interleave GetMaxY iterations with these many non-GetMaxY iterations (default 5)
-M [1..] → Max no of SDM iterations (default 100000)


SMO Optimization Options:

-s float → Max SMO tolerance : Terminate SMO on achieving this tolerance (default 0.15)
-$ [1..] → Max no of SMO iterations (default 100000)


For classifying a test file using an already available model file, use the following command

structsvm_sdm_classify [options] test_file model_file

Classify Options:

-h → help
-v [0..3] → verbosity level (default 1)


Running SDM on example Data

The structsvm_sdm directory contains a sub-folder data in which a sample data file sampledata is given.

You can try running SDM on this data by typing the following command:

structsvm_sdm_learn data/sampledata

You should get a primal objective value ≈ 14.59 and a dual objective value ≈ 14.51.

The performance on the sample train data is given in the graph below:

train perf

Description of feature vector used in the implementation

For sequence labeling application, the input is made of parts x=(x1,x2,…,xT) and the output is also made of corresponding parts y=(y1,y2,…,yT). The feature vector for the input (x,y) given by f(x,y) is constructed using the first-order features and second-order features. The first-order features are obtained by placing the features xt at yt-th position. The second-order features are obtained by considering the label assignment to the label pair yt-yt-1 at every stage t=1,2,…,T.

Input File Format

A). The input file format should be of SVMStruct format.

example :
4 qid:1 5:-1 345:0.3 2101:0.89
6 qid:1 1:-1.5 5:0.2 39:1.98
3 qid:2 567:-0.24 689:0.8 792:0.21 

In case your data has the CoNLL text chunking format, please go to Step. B.

If the input file format is different from these formats, then the reader module has to be changed.

All other modules which assume this format should also be changed.

B). CoNLL text chunking data has the following form :

But CC O
we PRP B-NP
certainly RB O
like IN O
what WP B-NP
we PRP B-NP
've VBP O
seen VBN O
so RB O
far RB O
. . O
" " O 

This data has a column-format (number of columns=3) and a suitable template file is required to extract features from this data.

The template file contains template rules of the following form :

U00:%x[-2,0] U01:%x[-1,0] U02:%x[0,0] U03:%x[1,0] U04:%x[2,0] U05:%x[-1,0]/%x[0,0] U06:%x[0,0]/%x[1,0] U10:%x[-2,1] U11:%x[-1,1] U12:%x[0,1] U13:%x[1,1] U14:%x[2,1] U15:%x[-2,1]/%x[-1,1] U16:%x[-1,1]/%x[0,1] U17:%x[0,1]/%x[1,1] U18:%x[1,1]/%x[2,1] U20:%x[-2,1]/%x[-1,1]/%x[0,1] U21:%x[-1,1]/%x[0,1]/%x[1,1] U22:%x[0,1]/%x[1,1]/%x[2,1]

B

Similar templates are available for other Bio-datasets like BioCreative and BioNLP, the data of which is also in the column format.

For such datasets, please use the utility conll2svmlightconverter by typing the following in the shell prompt:

conll2svmlightconverter template_file train_file test_file validation_file

This will extract the features in SVMStruct format in a new file, the name for which is displayed in the shell prompt.

To know more about this utility, just type conll2svmlightconverter in the shell prompt.

Once you get the features in SVMStruct format, you can start training as described in Step. 8.

The conll2svmlightconverter tool is modelled similar to the feature generation procedure in CRFSGD by Leon Bottou.

Contact

In case you find any problems or have any questions related to this software, please mail Balamurugan.