Querying available formats

This page shows how to find out which file formats OpenBabel supports.
Running the babel executable with the -H option prints its help text, which ends with a list of the supported formats.
Source: (babel13-1.py)
    import os

    # Print the babel help text, which includes the list of supported formats.
    os.system("babel -H")
stdout:
Open Babel converts chemical structures from one file format to another

Usage: babel <input spec> <output spec> [Options]

Each spec can be a file whose extension decides the format.
Optionally the format can be specified by preceding the file by
-i<format-type> e.g. -icml, for input and -o<format-type> for output

See below for available format-types, which are the same as the 
file extensions and are case independent.
If no input or output file is given stdin or stdout are used instead.

More than one input file can be specified and their names can contain
wildcard chars (* and ?).The molecules are aggregated in the output file.

Conversion options
-f <#> Start import at molecule # specified
-l <#> End import at molecule # specified
-e Continue with next object after error, if possible
-z Compress the output with gzip
-k Attempt to translate keywords
-H Outputs this help text
-Hxxx (xxx is file format ID e.g. -Hcml) gives format info
-Hall Outputs details of all formats
-V Outputs version number
-L <BaseType> Lists plugin classes of this type
    e.g. <fingerprints>, or <plugins> for a list of BaseTypes
-m Produces multiple output files, to allow:
   Splitting: e.g.        babel infile.mol new.smi -m
     puts each molecule into new1.smi new2.smi etc
   Batch conversion: e.g. babel *.mol -osmi -m
     converts each input file to a .smi file
For conversions of molecules
Additional options :
-d Delete hydrogens (make implicit)
-h Add hydrogens (make explicit)
-p Add Hydrogens appropriate for pH model
-b Convert dative bonds e.g.[N+]([O-])=O to N(=O)=O
-c Center Coordinates
-C Combine mols in first file with others having same name
--filter <filterstring> Filter: convert only when tests are true:
--add <list> Add properties from descriptors:
--delete <list> Delete properties in list:
--append <list> Appends properties or descriptors in list to title:
-s"smarts" Convert only molecules matching SMARTS:
-v"smarts" Convert only molecules NOT matching SMARTS:
--join Join all input molecules into a single output molecule
--separate Output disconnected fragments separately
--property <attrib> <value> add or replace a property (SDF)
--title <title> Add or replace molecule title
--addtotitle <text> Append to title
--addformula Append formula to title
--AddPolarH Adds hydrogen to polar atoms only
--center Centers coordinates around (0,0,0)
--gen3D Generate 3D coordinates
--nodative Replace [N+]([O-])=O by N(=O)=O

Interface to OBAPI internals
 API options, e.g. ---errorlevel 2
  errorlevel # min warning level displayed
 
The following file formats are recognized:
acr -- ACR format [Read-only]
adf -- ADF cartesian input format [Write-only]
adfout -- ADF output format [Read-only]
alc -- Alchemy format
arc -- Accelrys/MSI Biosym/Insight II CAR format [Read-only]
bgf -- MSI BGF format
box -- Dock 3.5 Box format
bs -- Ball and Stick format
c3d1 -- Chem3D Cartesian 1 format
c3d2 -- Chem3D Cartesian 2 format
cac -- CAChe MolStruct format [Write-only]
caccrt -- Cacao Cartesian format
cache -- CAChe MolStruct format [Write-only]
cacint -- Cacao Internal format [Write-only]
can -- Canonical SMILES format.
car -- Accelrys/MSI Biosym/Insight II CAR format [Read-only]
ccc -- CCC format [Read-only]
cdx -- ChemDraw binary format [Read-only]
cdxml --  ChemDraw CDXML format 
cht -- Chemtool format [Write-only]
cif -- Crystallographic Information File
ck -- ChemKin format
cml -- Chemical Markup Language
cmlr -- CML Reaction format
com -- Gaussian 98/03 Input [Write-only]
copy -- Copies raw text [Write-only]
crk2d -- Chemical Resource Kit diagram(2D)
crk3d -- Chemical Resource Kit 3D format
csr -- Accelrys/MSI Quanta CSR format [Write-only]
cssr -- CSD CSSR format [Write-only]
ct -- ChemDraw Connection Table format 
cub -- Gaussian cube format
cube -- Gaussian cube format
dmol -- DMol3 coordinates format
dx -- Gaussian cube format
ent -- Protein Data Bank format
fa -- FASTA format [Write-only]
fasta -- FASTA format [Write-only]
fch -- Gaussian formatted checkpoint file format [Read-only]
fchk -- Gaussian formatted checkpoint file format [Read-only]
fck -- Gaussian formatted checkpoint file format [Read-only]
feat -- Feature format
fh -- Fenske-Hall Z-Matrix format [Write-only]
fix -- SMILES FIX format [Write-only]
fpt -- Fingerprint format [Write-only]
fract -- Free Form Fractional format
fs -- FastSearching
fsa -- FASTA format [Write-only]
g03 -- Gaussian98/03 Output [Read-only]
g92 -- Gaussian98/03 Output [Read-only]
g94 -- Gaussian98/03 Output [Read-only]
g98 -- Gaussian98/03 Output [Read-only]
gal -- Gaussian98/03 Output [Read-only]
gam -- GAMESS Output [Read-only]
gamin -- GAMESS Input
gamout -- GAMESS Output [Read-only]
gau -- Gaussian 98/03 Input [Write-only]
gjc -- Gaussian 98/03 Input [Write-only]
gjf -- Gaussian 98/03 Input [Write-only]
gpr -- Ghemical format
gr96 -- GROMOS96 format [Write-only]
gukin -- GAMESS-UK Input
gukout -- GAMESS-UK Output
gzmat -- Gaussian Z-Matrix Input
hin -- HyperChem HIN format
inchi -- InChI format
inp -- GAMESS Input
ins -- ShelX format [Read-only]
jin -- Jaguar input format [Write-only]
jout -- Jaguar output format [Read-only]
k -- Compare molecules using InChI [Write-only]
mcdl -- MCDL format
mcif -- Macromolecular Crystallographic Information
mdl -- MDL MOL format
ml2 -- Sybyl Mol2 format
mmcif -- Macromolecular Crystallographic Information
mmd -- MacroModel format
mmod -- MacroModel format
mol -- MDL MOL format
mol2 -- Sybyl Mol2 format
molden -- Molden input format [Read-only]
molreport -- Open Babel molecule report [Write-only]
moo -- MOPAC Output format [Read-only]
mop -- MOPAC Cartesian format
mopcrt -- MOPAC Cartesian format
mopin -- MOPAC Internal
mopout -- MOPAC Output format [Read-only]
mpc -- MOPAC Cartesian format
mpd -- Sybyl descriptor format [Write-only]
mpqc -- MPQC output format [Read-only]
mpqcin -- MPQC simplified input format [Write-only]
msi -- Accelrys/MSI Cerius II MSI format [Read-only]
msms -- M.F. Sanner's MSMS input format [Write-only]
nw -- NWChem input format [Write-only]
nwo -- NWChem output format [Read-only]
outmol -- DMol3 coordinates format
pc --  PubChem format  [Read-only]
pcm -- PCModel Format
pdb -- Protein Data Bank format
png -- PNG files with embedded data
pov -- POV-Ray input format [Write-only]
pqr -- PQR format
pqs -- Parallel Quantum Solutions format
prep -- Amber Prep format [Read-only]
qcin -- Q-Chem input format [Write-only]
qcout -- Q-Chem output format [Read-only]
report -- Open Babel report format [Write-only]
res -- ShelX format [Read-only]
rsmi -- Reaction SMILES format
rxn -- MDL RXN format
sd -- MDL MOL format
sdf -- MDL MOL format
smi -- SMILES format
smiles -- SMILES format
sy2 -- Sybyl Mol2 format
t41 -- ADF TAPE41 format [Read-only]
tdd -- Thermo format
test -- Test format [Write-only]
therm -- Thermo format
tmol -- TurboMole Coordinate format
txt -- Title format
txyz -- Tinker MM2 format [Write-only]
unixyz -- UniChem XYZ format
vmol -- ViewMol format
xed -- XED format [Write-only]
xml --  General XML format [Read-only]
xtc -- XTC format [Read-only]
xyz -- XYZ cartesian coordinates format
yob -- YASARA.org YOB format
zin -- ZINDO input format [Write-only]

See further specific info and options using -H<format-type>, e.g. -Hcml
Run time: 35.0 ms
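When only the executable is at hand, the format table can be recovered by scraping the help text. The sketch below assumes that every format line has the shape "code -- description", optionally followed by [Read-only] or [Write-only], as in the output above. The names parse_format_table and read_babel_help are illustrative helpers, not part of OpenBabel, and read_babel_help assumes Python 3.7 or later with a babel executable on the PATH.

```python
import re
import subprocess

# One format line: "<code> -- <description>" with an optional access tag.
_FORMAT_LINE = re.compile(
    r"^(?P<code>\S+) -- (?P<desc>.*?)\s*"
    r"(?:\[(?P<access>Read-only|Write-only)\])?\s*$"
)


def parse_format_table(help_text):
    """Return {code: (description, access)} parsed from `babel -H` output.

    access is "Read-only", "Write-only", or None when the line has no tag.
    Hypothetical helper: it relies only on the text layout shown above.
    """
    table = {}
    in_table = False
    for line in help_text.splitlines():
        # The format list starts after this header line.
        if line.startswith("The following file formats are recognized"):
            in_table = True
            continue
        if not in_table:
            continue
        match = _FORMAT_LINE.match(line)
        if match:
            description = match.group("desc").strip()
            table[match.group("code")] = (description, match.group("access"))
    return table


def read_babel_help(executable="babel"):
    """Run `<executable> -H` and return its standard output as text.

    Assumption: the executable is installed and reachable on the PATH.
    """
    result = subprocess.run([executable, "-H"], capture_output=True, text=True)
    return result.stdout
```

With babel installed, parse_format_table(read_babel_help()) would give a dictionary keyed by format code. Scraping help text is fragile, since it breaks whenever the layout changes, which is one reason to prefer querying the library directly.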
However, what we need is a way to query this information from the openbabel library itself rather than from the babel executable. The following code does so with the GetSupportedInputFormat() and GetSupportedOutputFormat() methods of OBConversion.
Source: (babel13-2.py)
    # On Open Babel 3.x the module is imported as: from openbabel import openbabel
    import openbabel

    conv = openbabel.OBConversion()
    print("---------- Input formats ----------")
    for fmt in conv.GetSupportedInputFormat():
        print(fmt)
    print("")
    print("---------- Output formats ----------")
    for fmt in conv.GetSupportedOutputFormat():
        print(fmt)
stdout:
---------- Input formats ----------
acr -- ACR format
adfout -- ADF output format
alc -- Alchemy format
arc -- Accelrys/MSI Biosym/Insight II CAR format
bgf -- MSI BGF format
box -- Dock 3.5 Box format
bs -- Ball and Stick format
c3d1 -- Chem3D Cartesian 1 format
c3d2 -- Chem3D Cartesian 2 format
caccrt -- Cacao Cartesian format
can -- Canonical SMILES format.
car -- Accelrys/MSI Biosym/Insight II CAR format
ccc -- CCC format
cdx -- ChemDraw binary format
cdxml --  ChemDraw CDXML format 
cif -- Crystallographic Information File
ck -- ChemKin format
cml -- Chemical Markup Language
cmlr -- CML Reaction format
crk2d -- Chemical Resource Kit diagram(2D)
crk3d -- Chemical Resource Kit 3D format
ct -- ChemDraw Connection Table format 
cub -- Gaussian cube format
cube -- Gaussian cube format
dmol -- DMol3 coordinates format
dx -- Gaussian cube format
ent -- Protein Data Bank format
fch -- Gaussian formatted checkpoint file format
fchk -- Gaussian formatted checkpoint file format
fck -- Gaussian formatted checkpoint file format
feat -- Feature format
fract -- Free Form Fractional format
fs -- FastSearching
g03 -- Gaussian98/03 Output
g92 -- Gaussian98/03 Output
g94 -- Gaussian98/03 Output
g98 -- Gaussian98/03 Output
gal -- Gaussian98/03 Output
gam -- GAMESS Output
gamin -- GAMESS Input
gamout -- GAMESS Output
gpr -- Ghemical format
gukin -- GAMESS-UK Input
gukout -- GAMESS-UK Output
gzmat -- Gaussian Z-Matrix Input
hin -- HyperChem HIN format
inchi -- InChI format
inp -- GAMESS Input
ins -- ShelX format
jout -- Jaguar output format
mcdl -- MCDL format
mcif -- Macromolecular Crystallographic Information
mdl -- MDL MOL format
ml2 -- Sybyl Mol2 format
mmcif -- Macromolecular Crystallographic Information
mmd -- MacroModel format
mmod -- MacroModel format
mol -- MDL MOL format
mol2 -- Sybyl Mol2 format
molden -- Molden input format
moo -- MOPAC Output format
mop -- MOPAC Cartesian format
mopcrt -- MOPAC Cartesian format
mopin -- MOPAC Internal
mopout -- MOPAC Output format
mpc -- MOPAC Cartesian format
mpqc -- MPQC output format
msi -- Accelrys/MSI Cerius II MSI format
nwo -- NWChem output format
outmol -- DMol3 coordinates format
pc --  PubChem format 
pcm -- PCModel Format
pdb -- Protein Data Bank format
png -- PNG files with embedded data
pqr -- PQR format
pqs -- Parallel Quantum Solutions format
prep -- Amber Prep format
qcout -- Q-Chem output format
res -- ShelX format
rsmi -- Reaction SMILES format
rxn -- MDL RXN format
sd -- MDL MOL format
sdf -- MDL MOL format
smi -- SMILES format
smiles -- SMILES format
sy2 -- Sybyl Mol2 format
t41 -- ADF TAPE41 format
tdd -- Thermo format
therm -- Thermo format
tmol -- TurboMole Coordinate format
txt -- Title format
unixyz -- UniChem XYZ format
vmol -- ViewMol format
xml --  General XML format
xtc -- XTC format
xyz -- XYZ cartesian coordinates format
yob -- YASARA.org YOB format

---------- Output formats ----------
adf -- ADF cartesian input format
alc -- Alchemy format
bgf -- MSI BGF format
box -- Dock 3.5 Box format
bs -- Ball and Stick format
c3d1 -- Chem3D Cartesian 1 format
c3d2 -- Chem3D Cartesian 2 format
cac -- CAChe MolStruct format
caccrt -- Cacao Cartesian format
cache -- CAChe MolStruct format
cacint -- Cacao Internal format
can -- Canonical SMILES format.
cdxml --  ChemDraw CDXML format 
cht -- Chemtool format
cif -- Crystallographic Information File
ck -- ChemKin format
cml -- Chemical Markup Language
cmlr -- CML Reaction format
com -- Gaussian 98/03 Input
copy -- Copies raw text
crk2d -- Chemical Resource Kit diagram(2D)
crk3d -- Chemical Resource Kit 3D format
csr -- Accelrys/MSI Quanta CSR format
cssr -- CSD CSSR format
ct -- ChemDraw Connection Table format 
cub -- Gaussian cube format
cube -- Gaussian cube format
dmol -- DMol3 coordinates format
dx -- Gaussian cube format
ent -- Protein Data Bank format
fa -- FASTA format
fasta -- FASTA format
feat -- Feature format
fh -- Fenske-Hall Z-Matrix format
fix -- SMILES FIX format
fpt -- Fingerprint format
fract -- Free Form Fractional format
fs -- FastSearching
fsa -- FASTA format
gamin -- GAMESS Input
gau -- Gaussian 98/03 Input
gjc -- Gaussian 98/03 Input
gjf -- Gaussian 98/03 Input
gpr -- Ghemical format
gr96 -- GROMOS96 format
gukin -- GAMESS-UK Input
gukout -- GAMESS-UK Output
gzmat -- Gaussian Z-Matrix Input
hin -- HyperChem HIN format
inchi -- InChI format
inp -- GAMESS Input
jin -- Jaguar input format
k -- Compare molecules using InChI
mcdl -- MCDL format
mcif -- Macromolecular Crystallographic Information
mdl -- MDL MOL format
ml2 -- Sybyl Mol2 format
mmcif -- Macromolecular Crystallographic Information
mmd -- MacroModel format
mmod -- MacroModel format
mol -- MDL MOL format
mol2 -- Sybyl Mol2 format
molreport -- Open Babel molecule report
mop -- MOPAC Cartesian format
mopcrt -- MOPAC Cartesian format
mopin -- MOPAC Internal
mpc -- MOPAC Cartesian format
mpd -- Sybyl descriptor format
mpqcin -- MPQC simplified input format
msms -- M.F. Sanner's MSMS input format
nw -- NWChem input format
outmol -- DMol3 coordinates format
pcm -- PCModel Format
pdb -- Protein Data Bank format
png -- PNG files with embedded data
pov -- POV-Ray input format
pqr -- PQR format
pqs -- Parallel Quantum Solutions format
qcin -- Q-Chem input format
report -- Open Babel report format
rsmi -- Reaction SMILES format
rxn -- MDL RXN format
sd -- MDL MOL format
sdf -- MDL MOL format
smi -- SMILES format
smiles -- SMILES format
sy2 -- Sybyl Mol2 format
tdd -- Thermo format
test -- Test format
therm -- Thermo format
tmol -- TurboMole Coordinate format
txt -- Title format
txyz -- Tinker MM2 format
unixyz -- UniChem XYZ format
vmol -- ViewMol format
xed -- XED format
xyz -- XYZ cartesian coordinates format
yob -- YASARA.org YOB format
zin -- ZINDO input format
Run time: 81.9 ms
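Taken together, the two lists carry the same information as the [Read-only] and [Write-only] tags in the command-line help. As a minimal sketch, assuming each entry keeps the "code -- description" shape shown above, the hypothetical helpers format_code and classify_formats (not part of OpenBabel) group the codes by direction using plain set operations.

```python
def format_code(entry):
    """Extract the format code from a 'code -- description' string."""
    return entry.split(" -- ", 1)[0].strip()


def classify_formats(input_entries, output_entries):
    """Group format codes by the direction(s) in which they are supported.

    Hypothetical helper: both arguments are iterables of strings such as
    those returned by GetSupportedInputFormat() / GetSupportedOutputFormat().
    """
    readable = {format_code(entry) for entry in input_entries}
    writable = {format_code(entry) for entry in output_entries}
    return {
        "read_write": sorted(readable & writable),
        "read_only": sorted(readable - writable),
        "write_only": sorted(writable - readable),
    }
```

It would typically be called as classify_formats(conv.GetSupportedInputFormat(), conv.GetSupportedOutputFormat()), with conv being the OBConversion object from the listing above.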
Each returned string has the form "code -- description", so splitting on " -- " yields either the bare format codes or a map from code to description.
Source: (babel13-3.py)
    import openbabel

    conv = openbabel.OBConversion()
    # codes of supported input formats
    codes = [f.split(" -- ", 1)[0] for f in conv.GetSupportedInputFormat()]
    print(codes)

    # map of supported output formats: code -> description
    pairs = [f.split(" -- ", 1) for f in conv.GetSupportedOutputFormat()]
    code2format = dict(pairs)
    print(code2format)
stdout:
['acr', 'adfout', 'alc', 'arc', 'bgf', 'box', 'bs', 'c3d1', 'c3d2', 'caccrt', 'can', 'car', 'ccc', 'cdx', 'cdxml', 'cif', 'ck', 'cml', 'cmlr', 'crk2d', 'crk3d', 'ct', 'cub', 'cube', 'dmol', 'dx', 'ent', 'fch', 'fchk', 'fck', 'feat', 'fract', 'fs', 'g03', 'g92', 'g94', 'g98', 'gal', 'gam', 'gamin', 'gamout', 'gpr', 'gukin', 'gukout', 'gzmat', 'hin', 'inchi', 'inp', 'ins', 'jout', 'mcdl', 'mcif', 'mdl', 'ml2', 'mmcif', 'mmd', 'mmod', 'mol', 'mol2', 'molden', 'moo', 'mop', 'mopcrt', 'mopin', 'mopout', 'mpc', 'mpqc', 'msi', 'nwo', 'outmol', 'pc', 'pcm', 'pdb', 'png', 'pqr', 'pqs', 'prep', 'qcout', 'res', 'rsmi', 'rxn', 'sd', 'sdf', 'smi', 'smiles', 'sy2', 't41', 'tdd', 'therm', 'tmol', 'txt', 'unixyz', 'vmol', 'xml', 'xtc', 'xyz', 'yob']
{'xed': 'XED format', 'cssr': 'CSD CSSR format', 'txyz': 'Tinker MM2 format', 'alc': 'Alchemy format', 'report': 'Open Babel report format', 'feat': 'Feature format', 'jin': 'Jaguar input format', 'fix': 'SMILES FIX format', 'rsmi': 'Reaction SMILES format', 'adf': 'ADF cartesian input format', 'pov': 'POV-Ray input format', 'cub': 'Gaussian cube format', 'pcm': 'PCModel Format', 'mopin': 'MOPAC Internal', 'mpqcin': 'MPQC simplified input format', 'mopcrt': 'MOPAC Cartesian format', 'mpd': 'Sybyl descriptor format', 'cube': 'Gaussian cube format', 'mpc': 'MOPAC Cartesian format', 'mop': 'MOPAC Cartesian format', 'dx': 'Gaussian cube format', 'mol': 'MDL MOL format', 'inchi': 'InChI format', 'hin': 'HyperChem HIN format', 'cml': 'Chemical Markup Language', 'gjf': 'Gaussian 98/03 Input', 'csr': 'Accelrys/MSI Quanta CSR format', 'gjc': 'Gaussian 98/03 Input', 'mdl': 'MDL MOL format', 'unixyz': 'UniChem XYZ format', 'gzmat': 'Gaussian Z-Matrix Input', 'crk3d': 'Chemical Resource Kit 3D format', 'cacint': 'Cacao Internal format', 'tdd': 'Thermo format', 'mmod': 'MacroModel format', 'bs': 'Ball and Stick format', 'mmd': 'MacroModel format', 'box': 'Dock 3.5 Box format', 'bgf': 'MSI BGF format', 'k': 'Compare molecules using InChI', 'vmol': 'ViewMol format', 'molreport': 'Open Babel molecule report', 'crk2d': 'Chemical Resource Kit diagram(2D)', 'gr96': 'GROMOS96 format', 'com': 'Gaussian 98/03 Input', 'pdb': 'Protein Data Bank format', 'ck': 'ChemKin format', 'cache': 'CAChe MolStruct format', 'c3d2': 'Chem3D Cartesian 2 format', 'xyz': 'XYZ cartesian coordinates format', 'c3d1': 'Chem3D Cartesian 1 format', 'mmcif': 'Macromolecular Crystallographic Information', 'txt': 'Title format', 'ct': 'ChemDraw Connection Table format ', 'therm': 'Thermo format', 'dmol': 'DMol3 coordinates format', 'ml2': 'Sybyl Mol2 format', 'fract': 'Free Form Fractional format', 'cht': 'Chemtool format', 'zin': 'ZINDO input format', 'cdxml': ' ChemDraw CDXML format ', 'gpr': 'Ghemical format', 
'gau': 'Gaussian 98/03 Input', 'sdf': 'MDL MOL format', 'gukin': 'GAMESS-UK Input', 'cmlr': 'CML Reaction format', 'copy': 'Copies raw text', 'tmol': 'TurboMole Coordinate format', 'png': 'PNG files with embedded data', 'cif': 'Crystallographic Information File', 'outmol': 'DMol3 coordinates format', 'mcif': 'Macromolecular Crystallographic Information', 'smi': 'SMILES format', 'can': 'Canonical SMILES format.', 'cac': 'CAChe MolStruct format', 'caccrt': 'Cacao Cartesian format', 'qcin': 'Q-Chem input format', 'inp': 'GAMESS Input', 'gukout': 'GAMESS-UK Output', 'sy2': 'Sybyl Mol2 format', 'fasta': 'FASTA format', 'msms': "M.F. Sanner's MSMS input format", 'yob': 'YASARA.org YOB format', 'mcdl': 'MCDL format', 'fpt': 'Fingerprint format', 'ent': 'Protein Data Bank format', 'test': 'Test format', 'nw': 'NWChem input format', 'smiles': 'SMILES format', 'fs': 'FastSearching', 'mol2': 'Sybyl Mol2 format', 'fa': 'FASTA format', 'pqr': 'PQR format', 'pqs': 'Parallel Quantum Solutions format', 'fh': 'Fenske-Hall Z-Matrix format', 'fsa': 'FASTA format', 'gamin': 'GAMESS Input', 'rxn': 'MDL RXN format', 'sd': 'MDL MOL format'}
Run time: 59.0 ms
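Some descriptions in the map above carry stray whitespace (for example ' ChemDraw CDXML format '), and the help text notes that format types match file extensions and are case independent. The following sketch builds on both observations. It assumes the same "code -- description" entry shape, and build_format_map and describe_file are hypothetical helper names, not OpenBabel functions.

```python
import os.path


def build_format_map(entries):
    """Map lowercase format codes to whitespace-trimmed descriptions."""
    format_map = {}
    for entry in entries:
        # partition splits at the first " -- " only.
        code, _, description = entry.partition(" -- ")
        format_map[code.strip().lower()] = description.strip()
    return format_map


def describe_file(filename, format_map):
    """Return the format description for a file's extension, or None.

    The extension is compared case-insensitively; names without an
    extension, or with an unknown one, yield None.
    """
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    return format_map.get(extension)
```

Passing conv.GetSupportedOutputFormat() to build_format_map would give a cleaned-up version of code2format. describe_file only looks at the last extension, so compressed names such as a file ending in .gz are not handled by this sketch.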