chado.rst 13 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296
  1. Accessing Chado
  2. ================
  3. Primarily biological data made available to Tripal is stored in the GMOD Chado
  4. schema. As such, you will likely need to interact with Chado at some point.
  5. Tripal has developed a number of API functions and classes to make this
  6. interaction easier and more generic.
  7. The Chado Query API
  8. --------------------
  9. Provides an API for querying of chado including inserting, updating, deleting and selecting from specific chado tables. There is also a generic function, ``chado_query()``, to execute and SQL statement on chado. It is ideal to use these functions to interact with chado in order to keep your module compatible with both local & external chado databases. Furthermore, it ensures connection to the chado database is taken care of for you.
  10. Generic Queries to a specifc chado table
  11. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  12. Selecting Records
  13. """"""""""""""""""
  14. ``chado_select_record( [table name], [columns to select], [specify record to select], [options*] )``
  15. This function allows you to select various columns from the specified chado table. Although you can only select from a single table, you can specify the record to select using values from related tables through use of a nested array. For example, the following code shows you how to select the name and uniquename of a feature based on it's type and source organism.
  16. .. code-block:: php
  17. $values = array(
  18. 'organism_id' => array(
  19. 'genus' => 'Citrus',
  20. 'species' => 'sinensis',
  21. ),
  22. 'type_id' => array (
  23. 'cv_id' => array (
  24. 'name' => 'sequence',
  25. ),
  26. 'name' => 'gene',
  27. 'is_obsolete' => 0
  28. ),
  29. );
  30. $result = chado_select_record(
  31. 'feature', // table to select from
  32. array('name', 'uniquename'), // columns to select
  33. $values // record to select (see variable defn. above)
  34. );
  35. Inserting Records
  36. """"""""""""""""""
  37. ``chado_insert_record( [table name], [values to insert], [options*] )``
  38. This function allows you to insert a single record into a specific table. The values to insert are specified using an associative array where the keys are the column names to insert into and they point to the value to be inserted into that column. If the column is a foreign key, the key will point to an array specifying the record in the foreign table and then the primary key of that record will be inserted in the column. For example, the following code will insert a feature and for the type_id, the cvterm.cvterm_id of the cvterm record will be inserted and for the organism_id, the organism.organism_id of the organism_record will be inserted.
  39. .. code-block:: php
  40. $values = array(
  41. 'organism_id' => array(
  42. 'genus' => 'Citrus',
  43. 'species' => 'sinensis',
  44. ),
  45. 'name' => 'orange1.1g000034m.g',
  46. 'uniquename' => 'orange1.1g000034m.g',
  47. 'type_id' => array (
  48. 'cv_id' => array (
  49. 'name' => 'sequence',
  50. ),
  51. 'name' => 'gene',
  52. 'is_obsolete' => 0
  53. ),
  54. );
  55. $result = chado_insert_record(
  56. 'feature', // table to insert into
  57. $values // values to insert
  58. );
  59. Updating Records
  60. """"""""""""""""""
  61. ``chado_update_record( [table name], [specify record to update], [values to change], [options*] )``
  62. This function allows you to update records in a specific chado table. The record(s) you wish to update are specified the same as in the select function above and the values to be update are specified the same as the values to be inserted were. For example, the following code species that a feature with a given uniquename, organism_id, and type_id (the unique constraint for the feature table) will be updated with a new name, and the type changed from a gene to an mRNA.
  63. .. code-block:: php
  64. $umatch = array(
  65. 'organism_id' => array(
  66. 'genus' => 'Citrus',
  67. 'species' => 'sinensis',
  68. ),
  69. 'uniquename' => 'orange1.1g000034m.g7',
  70. 'type_id' => array (
  71. 'cv_id' => array (
  72. 'name' => 'sequence',
  73. ),
  74. 'name' => 'gene',
  75. 'is_obsolete' => 0
  76. ),
  77. );
  78. $uvalues = array(
  79. 'name' => 'orange1.1g000034m.g',
  80. 'type_id' => array (
  81. 'cv_id' => array (
  82. 'name' => 'sequence',
  83. ),
  84. 'name' => 'mRNA',
  85. 'is_obsolete' => 0
  86. ),
  87. );
  88. $result = chado_update_record('feature',$umatch,$uvalues);
  89. Deleting Records
  90. """""""""""""""""
  91. ``chado_delete_record( [table name], [specify records to delete], [options*] )``
  92. This function allows you to delete records from a specific chado table. The record(s) to delete are specified the same as the record to select/update was above. For example, the following code will delete all genes from the organism Citrus sinensis.
  93. .. code-block:: php
  94. $values = array(
  95. 'organism_id' => array(
  96. 'genus' => 'Citrus',
  97. 'species' => 'sinensis',
  98. ),
  99. 'type_id' => array (
  100. 'cv_id' => array (
  101. 'name' => 'sequence',
  102. ),
  103. 'name' => 'gene',
  104. 'is_obsolete' => 0
  105. ),
  106. );
  107. $result = chado_select_record(
  108. 'feature', // table to select from
  109. $values // records to delete (see variable defn. above)
  110. );
  111. Generic Queries for any SQL
  112. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  113. Often it is necessary to select from more then one table in chado or to execute other complex queries that cannot be handled efficiently by the above functions. It is for this reason that the ``chado_query( [sql string], [arguments to sub-in to the sql] )`` function was created. This function allows you to execute any SQL directly on the chado database and should be used with care. If any user input will be used in the query make sure to put a placeholder in your SQL string and then define the value in the arguments array. This will make sure that the user input is sanitized and safe through type-checking and escaping. The following code shows an example of how to use user input resulting from a form and would be called with the form submit function.
  114. .. code-block:: php
  115. $sql = "SELECT F.name, CVT.name as type_name, ORG.common_name
  116. FROM feature F
  117. LEFT JOIN cvterm CVT ON F.type_id = CVT.cvterm_id
  118. LEFT JOIN organism ORG ON F.organism_id = ORG.organism_id
  119. WHERE
  120. F.uniquename = :feature_uniquename";
  121. $args = array( ':feature_uniquename' => $form_state['values']['uniquename'] );
  122. $result = chado_query( $sql, $args );
  123. foreach ($result as $r) { [Do something with the records here] }
  124. If you are going to need more then a couple fields, you might want to use the Chado Variables API (specifically ``chado_generate_var()``) to select all of the common fields needed including following foreign keys.
  125. Loading of Variables from chado data
  126. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  127. These functions, ``chado_generate_var()`` and ``chado_expand_var()``, generate objects containing the full details of a record(s) in chado. These should be used in all theme templates.
  128. This differs from the objects returned by ``chado_select_record`` in so far as all foreign key relationships have been followed meaning you have more complete details. Thus this function should be used whenever you need a full variable and ``chado_select_record`` should be used if you only case about a few columns.
  129. The initial variable is generated by the ``chado_generate_var([table], [filter criteria], [optional options])`` function. An example of how to use this function is:
  130. .. code-block:: php
  131. $values = array(
  132. 'name' => 'Medtr4g030710'
  133. );
  134. $features = chado_generate_var('feature', $values);
  135. This will return an object if there is only one feature with the name Medtr4g030710 or it will return an array of feature objects if more than one feature has that name.
  136. Some tables and fields are excluded by default. To have those tables & fields added to your variable you can use the ``chado_expand_var([chado variable], [type], [what to expand], [optional options])`` function. An example of how to use this function is:
  137. .. code-block:: php
  138. // Get a chado object to be expanded
  139. $values = array(
  140. 'name' => 'Medtr4g030710'
  141. );
  142. $features = chado_generate_var('feature', $values);
  143. // Expand the organism node
  144. $feature = chado_expand_var($feature, 'node', 'organism');
  145. // Expand the feature.residues field
  146. $feature = chado_expand_var($feature, 'field', 'feature.residues');
  147. // Expand the feature properties (featureprop table)
  148. $feature = chado_expand_var($feature, 'table', 'featureprop');
  149. The Chado Schema API
  150. --------------------
  151. The Chado Schema API provides an application programming interface (API) for describing Chado tables, accessing these descriptions and checking for compliancy of your current database to the chado schema. This API consists of the ChadoSchema class which provides methods for interacting with the Chado Schema API and a collection of supporting functions, one for each table in Chado, which describe each version of the Chado schema. Each function simply returns a Drupal style array that defines the table.
  152. Ensuring columns Tables & Columns exist
  153. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  154. Generally you can assume the tables and columns in the Chado schema have been unaltered. That said, there are still cases where you might want to check that specific tables and columns exist. For example, when using a custom table, it is best practice to ensure it is there before querying as it can be removed through the administrative interface.
  155. To check the existence of a specific table and column, you can use the following:
  156. .. code-block:: php
  157. $chado_schema = new \ChadoSchema();
  158. // Check that the organism_feature_count custom table exists.
  159. $table_name = 'organism_feature_count';
  160. $table_exists = $chado_schema->checkTableExists($table_name);
  161. if ($table_exists) {
  162. // Check that the organism_feature_count.feature_id column exists.
  163. $column_name = 'feature_id';
  164. $column_exists = $chado_schema->checkColumnExists($table_name, $column_name);
  165. if ($column_exists) {
  166. [ do your query, etc. here ]
  167. } else { [warn the admin using tripal_repot_error()] }
  168. } else { [warn the admin using tripal_repot_error()] }
  169. Checking the Schema Version
  170. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  171. If you are using chado tables specific to a given version of Chado, it is best practice to check the chado version of the current site before querying those tables. You can use the following query to do this:
  172. .. code-block:: php
  173. $chado_schema = new \ChadoSchema();
  174. $version = $chado_schema-getVersion();
  175. if ($version == '1.3') {
  176. [do your chado v1.3 specific querying here]
  177. } else { [warn the admin using tripal_report_error() ] }
  178. Retrieving a list of tables
  179. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  180. To retrieve a list of Chado tables, you can use the following:
  181. .. code-block:: php
  182. $chado_schema = new \ChadoSchema();
  183. // All Chado Tables including custom tables
  184. $all_tables = $chado_schema->getTableNames(TRUE);
  185. // All Chado Tables without custom tables
  186. $all_tables = $chado_schema->getTableNames();
  187. // Chado tables designated as Base Tables by Tripal.
  188. $base_tables = $chado_schema->getBaseTables();
  189. Ensuring your Chado instance is compliant
  190. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  191. Checking compliancy of your Chado instance with the released Chado Schema is a great way to **confirm an upgrade has gone flawlessly**. Additionally, while it is not recommended, sometimes customizations to the Chado schema may be necessary. In these cases, you should **ensure backwards compatibility** through compliance checking to confirm Tripal will work as expected.
  192. Chado compliancy testing is provided with Tripal's automated PHPUnit testing. As such, to test compliancy of your specific Chado instance, you first need to install Composer. Luckily this can be as easy as:
  193. .. code-block:: bash
  194. php -r "copy('https://getcomposer.org/installer', 'composer-setup.php');"
  195. php -r "if (hash_file('SHA384', 'composer-setup.php') === '544e09ee996cdf60ece3804abc52599c22b1f40f4323403c44d44fdfdd586475ca9813a858088ffbc1f233e9b180f061') { echo 'Installer verified'; } else { echo 'Installer corrupt'; unlink('composer-setup.php'); } echo PHP_EOL;"
  196. php composer-setup.php
  197. php -r "unlink('composer-setup.php');"
  198. Once you have Composer, you need to install PHPUnit. This is installed locally within your Tripal repository. The following bash snippet shows you how to both install composer locally and run compliance checking.
  199. .. code-block:: php
  200. cd [DRUPAL_ROOT]/sites/all/modules/tripal
  201. composer up
  202. # Now run compliance checking
  203. ./vendor/bin/phpunit --group chado-compliance
  204. Schema Definition
  205. ^^^^^^^^^^^^^^^^^^
  206. To retrieve the schema definition for a specific table, you can execute the following:
  207. .. code-block:: php
  208. $table_name = 'feature';
  209. $chado_schema = new \ChadoSchema();
  210. $table_schema = $chado_schema->getTableSchema($table_name);
  211. The resulting ``$table_schema`` variable contains a Drupal-style array describing the schema definition of the table specified by ``$table_name``. This is a great tool when trying to develop generic queries, since you can extract information about an unknown table and use it to build a query for that table. For more information on the format of this array, see `the Drupal Schema API documentation <https://api.drupal.org/api/drupal/includes%21database%21schema.inc/group/schemaapi/7.x>`_.