Applying %IMP to a new biological system {#biosystem}
=====================================================

We have already applied %IMP to solve the structures of many novel biological
systems, listed on the [biological systems page](https:
Each system on that page includes all of the files needed to reproduce the
results in the accompanying publication. For example, the list includes the
[modeling example from earlier in this manual](@ref rnapolii_stalk), as well
as [modeling of the Nup84 subcomplex of the Nuclear Pore Complex](https:
to make sure that it still works correctly.

To apply %IMP to a new biological system, you are welcome to use one of the
existing systems, such as the [Nup84 model](https:
as your template - or you can write from scratch using the basic %IMP classes
and/or the IMP::pmi higher level interface. In either case, we strongly
recommend that you manage your application as a GitHub repository so that
- others can reproduce your published work
- changes to the protocol can be documented or rolled back if necessary
- your system can be added to [our list](https:

We recommend the following contents for your repository (see the
[Nup84 repository](https:

- subdirectories containing
  - your modeling protocol (generally one or more Python scripts)
  - input files (e.g. PDB files, EM density maps, lists of crosslinks),
    especially if these files aren't in a database somewhere already
  - outputs (trajectories, clusters, analysis). Where this isn't possible
    due to size, we can host the larger files, such as trajectories, elsewhere
    (e.g. as a dataset in [Zenodo](https:

- a top-level `%README.md` file describing the system and explaining how to
- a top-level `LICENSE` file with the license for the data files and scripts.
  This doesn't need to be the same license (LGPL/GPL) that %IMP uses; in fact,
  for data files one of the [Creative Commons](https://creativecommons.org/)
  licenses probably makes more sense. We recommend the
  [CC BY-SA license](https://creativecommons.org/licenses/by-sa/4.0/)
  which allows anybody to use and modify the data under the same terms, as
  long as they cite the original work.
- a `test` directory containing one or more Python scripts with names starting
  with `test`. It should be
  possible to run these scripts without any "special" setup (e.g. they should
  not require any input arguments or environment variables, or use
  hard-coded paths). These scripts should run as much of your modeling
  protocol as possible, and ideally test the results (e.g. by comparing models
  against 'known good' clusters). Each script should simply exit with a
  non-zero exit code (e.g. by raising an exception) if something failed; one
  easy way to do this nicely is to use Python's
  tests should run in a "reasonable" amount of time (no more than 48 hours)
  on a single processor. If this is not enough time to run your entire
  protocol, run only a representative subset
  (e.g. the Nup84 modeling test passes a `--test` option to the modeling
  script, which has it perform fewer iterations of sampling).
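
As a rough illustration of the `test` directory guidance, a test script along
the following lines could be placed at, say, `test/test_modeling.py`. This is
only a sketch, assuming Python's standard `unittest` module; the `modeling`
subdirectory, the `run.py` script name, and the `output` directory are
hypothetical names chosen for the example, not part of any %IMP convention.
The only detail taken from the text is that the modeling script accepts a
`--test` option to shorten the run.

```python
import os
import subprocess
import sys
import unittest


class ModelingTests(unittest.TestCase):
    # Hypothetical names: adjust these to match your own repository layout.
    script_dir = 'modeling'      # subdirectory holding the modeling protocol
    script_name = 'run.py'       # modeling script assumed to accept --test
    expected_output = 'output'   # directory the script is assumed to create

    def get_top_dir(self):
        """Return the top level of the repository (the parent of test/)."""
        # Derived from this file's own location, so no hard-coded paths,
        # input arguments, or environment variables are needed.
        return os.path.abspath(
            os.path.join(os.path.dirname(__file__), '..'))

    def test_short_modeling_run(self):
        """A shortened modeling run should finish and produce output."""
        workdir = os.path.join(self.get_top_dir(), self.script_dir)
        # check_call raises CalledProcessError if the modeling script exits
        # with a non-zero code, which makes this test fail.
        subprocess.check_call(
            [sys.executable, self.script_name, '--test'], cwd=workdir)
        # Minimal check on the results: the output directory now exists.
        self.assertTrue(
            os.path.isdir(os.path.join(workdir, self.expected_output)))
```

Ending the file with the usual `if __name__ == '__main__': unittest.main()`
guard lets the script be run directly with no arguments; `unittest.main()`
exits with a non-zero code when any test fails. A real test would ideally go
further than checking that output exists, for example by comparing the
resulting models against known good clusters.
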
- to add your system to [our list](https:
  it will also need a `metadata` subdirectory (also
  This should contain two files:
  - `thumb.png`: a small image used to represent your system on the page.
  - `metadata.yaml`: a file in [YAML](http:
    (see also the [Nup84 example](https:
    - `title`: a short descriptive name for your system
    - `tags`: a list of tags to group your system with others that use
      similar methods or input data
    - `pmid`: the PubMed ID of the accompanying publication
    - `prereqs`: a list of any non-standard packages that are needed
      (in addition to %IMP and Python's standard library) to run the scripts
    - `runtime`: upper limit to the time the tests will take to run
    - `build`: which type of %IMP build to run the tests with
      (`release`, `fast` or `debug`); `release` is generally recommended
    - `parallel`: if set, the tests will be run in an MPI environment, with
      the given number of cores available (by default, a serial environment