Load data into TriFusion

TriFusion handles different types and formats of input files, depending on which module you want to use. The Orthology module works with proteome and group files, while the Process and Statistics modules work with alignment files. Regardless of the module, input files are loaded into the application in mostly the same way (see How to load data into the app below).

Input types and formats

Orthology - explore

Group files are one of the outputs of the Orthology search operation and the input of the Orthology explore operation. These are simple text files that contain all ortholog groups identified in the search operation by OrthoMCL:

Ortholog1: Afumigatus_proteins|433 Anidulans_proteins|4605 (...)
Ortholog2: Afumigatus_proteins|3278 Afumigatus_proteins|9183 (...)
Ortholog3: Anidulans_proteins|36 Anidulans_proteins|9893 (...)
(...)

Each line contains the name of the ortholog group, followed by a colon and a list of sequence references separated by whitespace. Each reference (e.g., Afumigatus_proteins|433) corresponds to an actual protein sequence from one of the input proteome files.
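The line layout described above can be read with a few lines of Python. This is only a minimal sketch and not TriFusion's own parser: the function name parse_groups is hypothetical, and it assumes that each reference has the form name|id, with the part before the | naming the proteome and the part after it identifying the sequence.

```python
def parse_groups(lines):
    """Map each ortholog group name to a list of (proteome, sequence_id) pairs.

    Hypothetical helper for illustration; not part of TriFusion.
    """
    groups = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and lines without the "name: references" shape
        if not line or ":" not in line:
            continue
        name, members = line.split(":", 1)
        refs = []
        for ref in members.split():
            # Assumption: a valid reference looks like "proteome|sequence_id"
            if "|" not in ref:
                continue
            proteome, seq_id = ref.split("|", 1)
            refs.append((proteome, seq_id))
        groups[name.strip()] = refs
    return groups


example = [
    "Ortholog1: Afumigatus_proteins|433 Anidulans_proteins|4605",
    "Ortholog2: Afumigatus_proteins|3278 Afumigatus_proteins|9183",
]
groups = parse_groups(example)
print(groups["Ortholog1"])
```

Sequence identifiers are kept as strings, since nothing in the format guarantees that they are always numeric.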

Process and Statistics

The Process and Statistics modules share the same input: sequence alignment files. The supported input formats are:

  • Fasta
  • Phylip
  • Nexus
  • Loci (PyRAD)
  • Stockholm

The input format, sequence type (nucleotide or protein) and sequence formatting (leave or interleave) of the provided alignment files are automatically detected by TriFusion. The missing data symbol used in each input alignment is also detected automatically from three possible symbols: x, n or ?.
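To give an idea of what automatic format detection can look like, the sketch below guesses a format from the first non-empty line of a file. It is an illustrative heuristic only and not TriFusion's actual detection routine: the function name guess_format is hypothetical, the checks rely on the usual header conventions of each format (a #NEXUS line, a # STOCKHOLM line, a > record marker for Fasta, and two integers for Phylip), and Loci files are not handled.

```python
def guess_format(first_line):
    """Guess an alignment format from its first non-empty line.

    Hypothetical heuristic for illustration; not TriFusion's own logic.
    """
    line = first_line.strip()
    upper = line.upper()
    if upper.startswith("#NEXUS"):
        return "nexus"
    if upper.startswith("# STOCKHOLM"):
        return "stockholm"
    if line.startswith(">"):
        return "fasta"
    # A Phylip header holds the number of taxa and the alignment length
    fields = line.split()
    if len(fields) == 2 and all(field.isdigit() for field in fields):
        return "phylip"
    return "unknown"


for header in ["#NEXUS", ">Afumigatus", " 7 1200", "# STOCKHOLM 1.0"]:
    print(repr(header), "->", guess_format(header))
```

Because each file is inspected on its own, a check of this kind works the same way whether the loaded files share one format or mix several.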

Note

Is there any constraint on how formats and sequence types can be loaded?

No. You can load files of multiple formats and sequence types all at once. All information will be automatically detected for each input alignment separately.

How to load data into the app

Note

Data availability for this tutorial: the small data set of 7 alignment files is available here.

Filechooser

Proteome and sequence alignment files can be loaded through the application’s file browser. To do so, navigate to Menu -> Open/View Data and click the Open file(s) button.

pic

This will open the main file browser, which supports several features:

  • A list of bookmarks is displayed on the left, and any directory can be added to this list by opening it and clicking the + button or pressing Ctrl + D.
  • On the top of the screen, you can choose the input data type (whether you are loading proteome or alignment files).
  • Below you can find the path of the current directory and several utility buttons to navigate the file browser.
  • At the bottom of the file browser, there is a text field that searches folders and files in the current directory. There is also a drop-down menu that filters files according to their extension.

pic

Navigate through the file browser by double clicking directories or clicking on the > symbol. Multiple files can be selected by holding either the Ctrl or Shift key while clicking. After completing your selection, click the Load & go back button to load the data and go back to the previous screen. If you wish to load additional data, click the Load selection button instead, which will load the data but remain in the file browser screen. In the example below, 7 files have been selected and are ready to be loaded.

pic

Note

TriFusion also supports the selection of one or more directories instead of files!

When directories are selected, all files contained in those directories will be loaded into TriFusion. There is no need to worry if some files in a directory are not alignments/proteomes: TriFusion will ignore invalid input files while successfully loading the valid alignment/proteome files.
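The idea of turning a mixed selection of files and directories into a flat list of files can be sketched as follows. This is an assumption-laden illustration rather than TriFusion's implementation: the function name expand_selection is hypothetical, only the files directly inside each directory are collected (whether TriFusion also descends into subdirectories is not stated here), and no validity check is performed on the files.

```python
import os


def expand_selection(paths):
    """Return the files named in `paths`, expanding directories one level deep.

    Hypothetical helper for illustration; not part of TriFusion.
    """
    files = []
    for path in paths:
        if os.path.isdir(path):
            # Assumption: collect only the files directly inside the directory
            for name in sorted(os.listdir(path)):
                full_path = os.path.join(path, name)
                if os.path.isfile(full_path):
                    files.append(full_path)
        elif os.path.isfile(path):
            files.append(path)
        # Paths that do not exist are silently skipped
    return files
```

In a complete loader, each returned path would then be checked individually, so that invalid files are skipped without interrupting the loading of valid ones.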

Drag and Drop

Input files can be provided to TriFusion’s window directly from your system’s file manager. After selecting the files, drag them into TriFusion’s window, which will display a popup informing you of how many files will be loaded and asking whether the files represent alignments, proteomes or groups. Directories can be dragged as well. In the example below, 7 sequence alignment files are loaded using this method.

pic

Via terminal

For terminal lovers (<3), files can be loaded automatically when launching the TriFusion application. If TriFusion’s executable is already in your $PATH environment variable, you can type its name in the terminal followed by any number of files.

pic

This will open TriFusion and automatically open a popup informing that 7 files will be loaded into TriFusion and asking whether the files represent alignment, proteome or group files. In this case, the data files correspond to alignments.

pic

Once the data type is selected, the files will be loaded normally into TriFusion.

pic