New Feature: Extracting rows and columns from a file

The GenomeSpace team is pleased to announce a new feature: server-side row and column extraction.

This feature lets you pull a set of rows and columns out of a tab- or comma-delimited data file and save it as its own file in GenomeSpace, without downloading the original file or re-uploading the filtered result. The Extract Rows and Columns dialog allows you to quickly and easily:

  • trim header lines from the top of a file
  • trim lines from the end of a file
  • extract selected columns from a file
  • save the subset of a GenomeSpace file as a new file

To access this feature you can either:

  • Right-click on a file and select Extract rows/cols from the pop-up menu
  • Select the checkbox for a single file (only one file may be selected), then select File>Extract rows and columns in the menu bar

This opens a dialog that shows a preview of the selected file: roughly the first 10 lines, or the first 50 KB if the file has a lot of columns.

From this dialog, you can select the first row at which to start the extraction (for example, you can trim out header lines by starting at a later row) and the last row to include (leave this blank to take the rest of the file from the starting row). Then select columns by checking the checkbox at the top of each column. If you want to select many columns at once, use the 'toggle all columns' link, which checks every unchecked column and unchecks every column that was previously checked. The rows and columns selected for the new file are highlighted in light blue, while those that will be cut out are displayed as grey text on a white background.
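
To make the selection rules concrete, here is a minimal Python sketch of the same kind of slicing done locally. It is not the GenomeSpace implementation: the function names, the 1-based row numbering, and the 0-based column indices are all assumptions made for illustration.

```python
import csv
import io


def extract_rows_and_columns(text, first_row=1, last_row=None,
                             columns=None, delimiter="\t"):
    """Return the selected rows and columns of delimited text.

    Illustrative only. Assumes rows are numbered from 1 and columns
    are 0-based indices. last_row=None means "through the end of the
    file"; columns=None means "keep every column".
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    for row_number, row in enumerate(reader, start=1):
        if row_number < first_row:
            continue  # trimmed from the top (e.g. header lines)
        if last_row is not None and row_number > last_row:
            break  # trimmed from the end
        if columns is not None:
            # Keep only checked columns; skip indices a short row lacks.
            row = [row[i] for i in columns if i < len(row)]
        writer.writerow(row)
    return out.getvalue()


def toggle_all_columns(selected, column_count):
    """Invert a column selection: checked becomes unchecked and vice versa."""
    checked = set(selected)
    return [i for i in range(column_count) if i not in checked]
```

For example, calling `extract_rows_and_columns(text, first_row=2, last_row=3, columns=[0, 2])` skips the first line of `text`, keeps the next two lines, and retains only the first and third columns of each.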

Finally, decide on the file name you want to use for the new subset file. The default inserts .slice before the file extension, leaving the extension itself intact. For example, if your source filename is myfile.gct, the default extracted file name will be myfile.slice.gct. Note that if you are removing header lines, the result may no longer be valid in the original format, so you may also want to change the file extension to match the new format.
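
The default naming rule can be sketched in a couple of lines of Python. The function name is hypothetical, and how GenomeSpace handles names with no extension or with several dots is not stated here, so that behavior is an assumption of this sketch.

```python
import os


def default_slice_name(filename):
    """Insert ".slice" before the final extension of a filename.

    Illustrative only. Assumes the "extension" is whatever follows
    the last dot; a name with no dot simply gets ".slice" appended.
    """
    stem, extension = os.path.splitext(filename)
    return f"{stem}.slice{extension}"
```

With this rule, `default_slice_name("myfile.gct")` yields `myfile.slice.gct`, matching the example above.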

Click Save to create the new file in the same GenomeSpace directory as the original file.