NEXUS Files
Reading NEXUS data
Since NEXUS is an Extensible File Format, its natural habitat is the file system. Thus, to instantiate a Nexus object, we typically read a file to access NEXUS data:
>>> from commonnexus import Nexus
>>> nex = Nexus.from_file('tests/fixtures/ape_random.trees')
>>> for name in nex.blocks:
...     print(name)
...
TAXA
TREES
- class commonnexus.nexus.Config(hyphenminus_is_text=True, asterisk_is_text=True, validate_newick=False, ignore_unsupported=True, encoding='utf8', no_default_matchchar=False, strict=False)[source]
The global behaviour of a Nexus instance can be configured. The available configuration options are set and accessed from an instance of Config.
- Parameters:
hyphenminus_is_text (bool)
asterisk_is_text (bool)
validate_newick (bool)
ignore_unsupported (bool)
encoding (str)
no_default_matchchar (bool)
strict (bool)
- hyphenminus_is_text: bool = True
Specifies whether “-”, aka ASCII hyphen-minus, is treated as text (True) or as punctuation (False).
- asterisk_is_text: bool = True
Specifies whether “*”, aka asterisk, is treated as text (True) or as punctuation (False).
- validate_newick: bool = False
Specifies whether Newick nodes for TREEs are constructed by parsing the Newick string (True) or from the Nexus tokens (False). The latter is slightly faster but will bypass some input validation.
- ignore_unsupported: bool = True
Specifies whether unsupported NEXUS commands/options are ignored or raise an error. Note that this option may only take effect when a block or command is accessed.
- encoding: str = 'utf8'
Specifies the text encoding of a NEXUS file.
- no_default_matchchar: bool = False
The NEXUS spec does not explicitly state a default value for the MATCHCHAR directive in the FORMAT command of a CHARACTERS block. commonnexus, in agreement with many NEXUS files encountered “in the wild”, assumes a default of “.”. To force no default value for MATCHCHAR, e.g. because matrix data uses “.” as a regular state symbol, set no_default_matchchar to True.
- strict: bool = False
Files found in the wild do not always follow the NEXUS spec entirely. Where a somewhat lax interpretation does not lead to ambiguities, commonnexus accepts such input. To force stricter adherence to the spec, set strict to True.
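The effect of the MATCHCHAR default can be illustrated with a small standalone sketch. It does not use commonnexus; it assumes the usual NEXUS reading of MATCHCHAR, where a cell holding the match character takes the state of the first taxon at that position. The helper name expand_matchchar is hypothetical.

```python
from typing import Dict, Optional


def expand_matchchar(matrix: Dict[str, str], matchchar: Optional[str] = ".") -> Dict[str, str]:
    """Replace match characters with the state of the first taxon (hypothetical helper)."""
    if matchchar is None or not matrix:
        # Corresponds to the idea behind no_default_matchchar=True:
        # "." is left alone as a regular state symbol.
        return dict(matrix)
    rows = list(matrix.items())
    first_states = rows[0][1]
    expanded = {}
    for label, states in rows:
        expanded[label] = "".join(
            first_states[i] if state == matchchar else state
            for i, state in enumerate(states)
        )
    return expanded


matrix = {"t1": "ACGT", "t2": "A..T", "t3": ".CC."}
with_default = expand_matchchar(matrix)           # "." treated as MATCHCHAR
without_default = expand_matchchar(matrix, None)  # "." kept as a state symbol
print(with_default)
print(without_default)
```

With the default, the rows for t2 and t3 are filled in from t1; without it, the matrix is returned unchanged.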
- class commonnexus.nexus.Nexus(s=None, block_implementations=None, config=None, **kw)[source]
A NEXUS object implemented as a list of commands with methods to read and write blocks.
From the spec:
The tokens in a NEXUS file are organized into commands, which are in turn organized into blocks.
This is reflected in the Nexus object. The Nexus object is just a list of Commands, and has a property Nexus.blocks giving access to commands grouped by block:
>>> nex = Nexus('#NEXUS BEGIN myblock; mycmd a b c; END;')
>>> nex[0].__class__
<class 'commonnexus.nexus.Command'>
>>> len(nex.blocks['MYBLOCK'])
1
Note
NEXUS is for the most part case-insensitive. commonnexus reflects this by giving all blocks and commands uppercase names. Thus, even if a command or block has a lowercase or mixed-case name in the file, the corresponding Command or Block object must be addressed using the uppercase name.
- Parameters:
s (typing.Union[typing.Iterable, typing.List[commonnexus.command.Command], None])
block_implementations (typing.Optional[typing.Dict[str, commonnexus.blocks.base.Block]])
config (typing.Optional[commonnexus.nexus.Config])
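The relation between a flat list of commands and blocks keyed by uppercase name can be sketched without the library. The following is only an illustration of the idea, not the commonnexus implementation: commands are modelled as plain (name, payload) tuples, and the function name group_by_block is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (name, payload); a stand-in for commonnexus' Command objects.
Command = Tuple[str, str]


def group_by_block(commands: List[Command]) -> Dict[str, List[List[Command]]]:
    """Group a flat command list into blocks, keyed by uppercase block name."""
    blocks = defaultdict(list)
    current = None
    for name, payload in commands:
        key = name.upper()  # names are normalized to uppercase
        if key == "BEGIN":
            current = [(key, payload)]
            blocks[payload.upper()].append(current)
        elif current is not None:
            current.append((key, payload))
            if key == "END":
                current = None
    return dict(blocks)


commands = [
    ("BEGIN", "myblock"), ("mycmd", "a b c"), ("END", ""),
    ("begin", "MyBlock"), ("end", ""),
]
blocks = group_by_block(commands)
print(list(blocks))            # block names, uppercased
print(len(blocks["MYBLOCK"]))  # number of blocks with that name
```

Both blocks end up under the key MYBLOCK even though they are spelled differently in the input, which is why lookups have to use the uppercase name.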
- __init__(s=None, block_implementations=None, config=None, **kw)[source]
- Parameters:
s (typing.Union[typing.Iterable, typing.List[commonnexus.command.Command], None]) – The NEXUS content.
block_implementations (typing.Optional[typing.Dict[str, commonnexus.blocks.base.Block]]) – Custom implementations for non-public blocks.
config (typing.Optional[commonnexus.nexus.Config]) – Configuration.
kw – If no Config object is passed as config, keyword parameters will be interpreted as configuration options. Thus,
>>> nex = Nexus(encoding='latin')
is a shortcut for
>>> nex = Nexus(config=Config(encoding='latin'))
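The keyword shortcut can be pictured with a standalone sketch. SketchConfig is a stand-in that only copies the option names and defaults documented for Config, and resolve_config is a hypothetical helper; neither is taken from the commonnexus source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchConfig:
    """Stand-in carrying the option names and defaults documented for Config."""
    hyphenminus_is_text: bool = True
    asterisk_is_text: bool = True
    validate_newick: bool = False
    ignore_unsupported: bool = True
    encoding: str = "utf8"
    no_default_matchchar: bool = False
    strict: bool = False


def resolve_config(config: Optional[SketchConfig] = None, **kw) -> SketchConfig:
    # An explicit config object wins; otherwise keywords become options.
    return config if config is not None else SketchConfig(**kw)


a = resolve_config(encoding="latin")
b = resolve_config(config=SketchConfig(encoding="latin"))
print(a == b)
```

Both calls yield the same configuration, with every option other than encoding left at its default.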
- classmethod from_file(p, config=None, **kw)[source]
Instantiate a Nexus object from the contents of a NEXUS file.
- Parameters:
p (typing.Union[str, pathlib.Path]) – Path of the file.
config (typing.Optional[commonnexus.nexus.Config]) – An optional configuration object.
kw – Configuration options, if no Config object is passed in.
- Returns:
A Nexus instance.
- property blocks: Dict[str, List[Block]]
A dict mapping uppercase block names to lists of instances of these blocks ordered as they appear in the NEXUS content.
For a shortcut to access blocks which are known to appear just once in the NEXUS content, see Nexus.__getattribute__().
- __getattribute__(name)[source]
NEXUS does not make any prescriptions regarding how many blocks with the same name may exist in a file. Thus, the primary way to access blocks is by looking up the list of blocks for a given name in Nexus.blocks. If it can be assumed that just one block for a name exists, or only the first block with that name is of interest, this block can also be accessed as Nexus.<BLOCK_NAME>, i.e. using the uppercase block name as attribute of the Nexus instance.
>>> nex = Nexus('#NEXUS begin block; cmd; end;')
>>> nex.BLOCK.name
'BLOCK'
>>> len(nex.BLOCK.commands)
1
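The attribute shortcut can be mimicked in a few lines. This toy class is not the commonnexus implementation: it uses __getattr__ rather than __getattribute__ for brevity, and returning None for a missing block is an assumption suggested by the (Nexus.DATA or Nexus.CHARACTERS) idiom used elsewhere in these docs.

```python
from typing import Dict, List


class SketchNexus:
    """Toy container illustrating first-block access by uppercase attribute."""

    def __init__(self, blocks: Dict[str, List[str]]):
        # Maps uppercase block names to blocks in order of appearance.
        self.blocks = blocks

    def __getattr__(self, name: str):
        # Only called when normal attribute lookup fails.
        if name.isupper():
            found = self.__dict__.get("blocks", {}).get(name)
            return found[0] if found else None  # assumption: None if absent
        raise AttributeError(name)


nex = SketchNexus({
    "TREES": ["first trees block", "second trees block"],
    "TAXA": ["taxa block"],
})
first_trees = nex.TREES          # shortcut: only the first block
all_trees = nex.blocks["TREES"]  # primary access: every block with that name
missing = nex.DATA
print(first_trees, len(all_trees), missing)
```

The shortcut silently hides the second TREES block, so it is only appropriate when one block per name can be assumed.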
- __str__()[source]
The string representation of a Nexus object is just its NEXUS content.
>>> nex = Nexus()
>>> nex.append_block(Block.from_commands([]))
>>> print(nex)
#NEXUS
BEGIN BLOCK;
END;
- to_file(p)[source]
Write the NEXUS content of a Nexus object to a file.
- Parameters:
p (typing.Union[str, pathlib.Path])
- property comments: List[str]
Comments may appear anywhere in a NEXUS file. Thus, they are the only kind of tokens not really grouped into a command.
While comments in commands can also be accessed from the command, comments preceding any command (and all others) can be accessed via this property.
>>> nex = Nexus("#nexus [created by commonnexus] begin block; cmd [does nothing]; end;")
>>> nex.BLOCK.CMD.comments
['does nothing']
>>> nex.comments[0]
'created by commonnexus'
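To make concrete why comments sit outside the command structure, here is a minimal standalone scanner that collects bracketed comments from NEXUS text. It is a simplification and not how commonnexus tokenizes: it ignores quoted strings and treats nested brackets as part of the outer comment. The function name extract_comments is hypothetical.

```python
from typing import List


def extract_comments(text: str) -> List[str]:
    """Collect the contents of top-level [...] comments (simplified)."""
    comments, depth, current = [], 0, []
    for char in text:
        if char == "[":
            if depth > 0:
                current.append(char)  # keep brackets of nested comments
            depth += 1
        elif char == "]" and depth > 0:
            depth -= 1
            if depth == 0:
                comments.append("".join(current))
                current = []
            else:
                current.append(char)
        elif depth > 0:
            current.append(char)
    return comments


text = "#nexus [created by commonnexus] begin block; cmd [does nothing]; end;"
found = extract_comments(text)
print(found)
```

The scanner returns both comments in file order, regardless of whether they appear inside a command or before the first one.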
- get_numbers(object_name, items)[source]
Determine object numbers suitable for inclusion in a set spec.
- resolve_set_spec(object_name, spec, chars=None)[source]
Resolve a set spec to a list of included items, specified by label or number.
- Parameters:
object_name –
spec –
- property characters: Block | None
Shortcut to get around the DATA/CHARACTERS ambiguity.
I.e. if one is interested in the characters matrix of a NEXUS file no matter whether this is included in a DATA or CHARACTERS block, Nexus.characters.get_matrix() can be used rather than (Nexus.DATA or Nexus.CHARACTERS).get_matrix().
- Returns:
The first DATA or CHARACTERS block.
- property taxa: List[str] | None
Shortcut to retrieve the list of taxa a NEXUS file provides data on.
- Returns:
The list of taxa labels used in a NEXUS file.
Note
There are various ways to encode taxa labels in a NEXUS file. This property looks them up in different places, ordered by explicitness, i.e.
1. A TAXLABELS command in a TAXA block.
2. A TAXLABELS command in a DATA or CHARACTERS block.
3. Taxa labels given implicitly as labels in a MATRIX command.
4. A TAXLABELS command in a DISTANCES block.
5. Taxa labels given implicitly as labels in a DISTANCES.MATRIX command.
6. Taxa labels given as mappings in the TRANSLATE command of a TREES block.
7. Taxa labels given implicitly as node names in the Newick representation of a tree in a TREE command in a TREES block.
Warning
Taxa descriptions in NEXUS may be inconsistent, e.g. a NEXUS file might contain a TAXA block, but introduce new taxa via NEWTAXA/TAXLABELS in a CHARACTERS block. commonnexus does not make an effort to check for consistency.
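The lookup order and the warning can be illustrated with a standalone sketch of a "first source that yields labels wins" strategy. This is an assumption about the general shape of such a lookup, not the commonnexus code; the helper first_available and the sample data are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# (description, lookup function returning labels or None)
Source = Tuple[str, Callable[[], Optional[List[str]]]]


def first_available(sources: Sequence[Source]) -> Optional[List[str]]:
    """Return the labels from the first source that provides any."""
    for _description, lookup in sources:
        labels = lookup()
        if labels:
            return labels
    return None


# Hypothetical file: no TAXLABELS anywhere, but a matrix and a tree.
sources = [
    ("TAXLABELS in TAXA", lambda: None),
    ("TAXLABELS in DATA/CHARACTERS", lambda: None),
    ("row labels of MATRIX", lambda: ["t1", "t2", "t3"]),
    ("node names in TREE", lambda: ["t1", "t2", "t3", "t4"]),
]
taxa = first_available(sources)
print(taxa)
```

In this sketch the tree mentions a fourth taxon that never shows up in the result, which is the kind of inconsistency the warning refers to: later sources are not cross-checked once an earlier one has answered.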
Writing NEXUS data
commonnexus provides functionality to write NEXUS by manipulating commonnexus.nexus.Nexus objects, which can then be written to a file.
>>> nex = Nexus()
>>> nex.to_file('test.nex')
will write a minimal NEXUS file containing just the text #NEXUS.
Since blocks are the somewhat self-contained units of information in NEXUS, the main ways to manipulate a Nexus object are
- Nexus.replace_block(old, new)[source]
- Parameters:
old (commonnexus.blocks.base.Block)
new (typing.Union[commonnexus.blocks.base.Block, typing.List[typing.Tuple[str, str]]])
The methods to add blocks accept Block instances as argument. Such instances can be obtained by calling the generic factory method
- classmethod Block.from_commands(commands, nexus=None, name=None, comment=None, TITLE=None, LINK=None, ID=None)[source]
Generic factory method for blocks.
This method will create a block with the uppercase name of the cls as name (or the explicitly passed block name). The (name str, payload str) tuples from commands are simply passed to commonnexus.command.Command.from_name_and_payload() to assemble the commands in the block.
This method should be used to create custom, non-public NEXUS blocks, while for public blocks the from_data method of the class implementing the block should be preferred, because the latter will make sure that consistent, valid block data is written.
- Parameters:
commands (typing.Iterable[typing.Union[str, typing.Tuple[str, str], typing.Tuple[str, str, str]]]) – The commands to be inserted in the body of the block. A command can be specified as a single string, which is taken as the name of the command, a pair (name, payload) or a triple (name, payload, comment).
nexus (typing.Optional[commonnexus.nexus.Nexus]) – A Nexus instance to look up global config options.
name (typing.Optional[str]) – Explicit name of the block to be created.
- Return type:
commonnexus.blocks.base.Block
- Returns:
The instantiated Block object.
>>> from commonnexus import Nexus, Block
>>> nex = Nexus()
>>> nex.append_block(Block.from_commands([('mycommand', 'with data')], name='myblock'))
>>> print(nex)
#NEXUS
BEGIN myblock;
mycommand with data;
END;
>>> str(nex.MYBLOCK.MYCOMMAND)
'with data'
- Parameters:
comment (typing.Optional[str])
TITLE (typing.Optional[str])
LINK (typing.Optional[str])
ID (typing.Optional[str])
or specific implementations of Block.from_data, such as commonnexus.blocks.characters.Characters.from_data() or commonnexus.blocks.trees.Trees.from_data().
Comments
The from_data methods of blocks accept a keyword argument comment to add a comment to a block construct.
To add a comment to the top of a NEXUS file, one can proceed as follows: