Bioplib
Protein Structure C Library
ReadRawPIR.c File Reference
#include <stdio.h>
#include <stdlib.h>
#include "SysDefs.h"
#include "seq.h"
#include "macros.h"


Functions

int blReadRawPIR (FILE *fp, char **seqs, int maxchain, BOOL upcase, SEQINFO *seqinfo, BOOL *error)
 

Detailed Description

Version
V2.8
Date
07.07.14
Author
Dr. Andrew C. R. Martin
Institute of Structural & Molecular Biology, University College London, Gower Street, London. WC1E 6BT.
andrew@bioinf.org.uk andrew.martin@ucl.ac.uk

This code is NOT IN THE PUBLIC DOMAIN, but it may be copied according to the conditions laid out in the accompanying file COPYING.DOC.

The code may be modified as required, but any modifications must be documented so that the person responsible can be identified.

The code may not be sold commercially or included as part of a commercial product except as described in the file COPYING.DOC.

Description:

Usage:

int blReadRawPIR(FILE *fp, char **seqs, int maxchain, BOOL upcase,
SEQINFO *seqinfo, BOOL *error)

As ReadPIR(), but reads punctuation characters without taking any special action. Used when punctuation characters have been used to indicate consensus sequence features.

Revision History:

Definition in file ReadRawPIR.c.

Function Documentation

int blReadRawPIR ( FILE *  fp,
char **  seqs,
int  maxchain,
BOOL  upcase,
SEQINFO *  seqinfo,
BOOL *  error
)
Parameters
[in]   *fp       File pointer
[in]   maxchain  Max number of chains to read. This is the dimension of the seqs array. N.B. THIS SHOULD BE AT LEAST 1 MORE THAN THE EXPECTED MAXIMUM NUMBER OF SEQUENCES
[in]   upcase    Should lower-case letters be upcased?
[out]  **seqs    Array of character pointers which will be filled in with sequence information. Memory will be allocated for any sequence length.
[out]  *seqinfo  This structure will be filled in with extra information about the sequence: header and title information, and details of any punctuation.
[out]  *error    TRUE if an error occurred (e.g. memory allocation)
Returns
Number of chains in this sequence. 0 if file ended, or no valid sequence entries found.

This is based on ReadPIR(), but reads all characters into the sequence arrays (i.e. all punctuation characters are read as is). This is useful when punctuation has been used to indicate consensus sequence features.

The only requirements of the code are that the PIR file should have 2 title lines per entry, the first line starting with a > sign. The routine will handle files containing multiple entries: successive calls return information on the next entry, and the routine returns 0 when there are no more entries.

Header line: Must start with >. Will handle files which don't have the proper P1; or F1; parts of the header as well as those which do.
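The header rule can be illustrated with a small stand-alone sketch. This is not Bioplib's code: parse_pir_header() is a hypothetical helper, and it assumes the optional type prefix is exactly two characters followed by a semicolon (as in P1; or F1;).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of Bioplib): extract the entry code from
   a PIR header line. Returns 1 on success, 0 if the line does not start
   with '>'. A two-character type followed by ';' (e.g. "P1;" or "F1;")
   is skipped when present; otherwise the code starts right after '>'. */
int parse_pir_header(const char *line, char *code, size_t codesize)
{
   size_t len;

   if(line == NULL || code == NULL || codesize == 0 || line[0] != '>')
      return 0;

   line++;                              /* step over '>'                 */

   /* Skip an optional "P1;"-style prefix: two characters then ';'      */
   if(line[0] != '\0' && line[1] != '\0' && line[2] == ';')
      line += 3;

   /* Copy up to the end of the line, truncating to fit the buffer      */
   len = strcspn(line, "\r\n");
   if(len >= codesize)
      len = codesize - 1;
   memcpy(code, line, len);
   code[len] = '\0';

   return 1;
}
```

Under these assumptions, both ">P1;ABC" and ">ABC" yield the code "ABC", while a line that does not begin with > is rejected.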

Title line: Will read the name and source fields if correctly separated by a -, otherwise copies all information into the name.
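A minimal sketch of the title-line rule follows. split_pir_title() and copy_trimmed() are hypothetical names, and the separator is assumed to be a hyphen with a space on each side (" - "); the library itself may recognise the separator differently.

```c
#include <stddef.h>
#include <string.h>

/* Copy len characters of src into dest, dropping surrounding blanks and
   any trailing line ending, and truncating to fit destsize.            */
static void copy_trimmed(char *dest, size_t destsize,
                         const char *src, size_t len)
{
   if(destsize == 0)
      return;
   while(len > 0 && (*src == ' ' || *src == '\t'))
   {
      src++;
      len--;
   }
   while(len > 0 && strchr(" \t\r\n", src[len-1]) != NULL)
      len--;
   if(len >= destsize)
      len = destsize - 1;
   memcpy(dest, src, len);
   dest[len] = '\0';
}

/* Hypothetical helper (not part of Bioplib): split a PIR title line into
   name and source at the first " - ". Returns 1 if a separator was
   found; otherwise the whole line goes into name, source is emptied and
   0 is returned. Both buffers must have a size of at least 1.          */
int split_pir_title(const char *line, char *name, size_t namesize,
                    char *source, size_t sourcesize)
{
   const char *sep = strstr(line, " - ");

   if(sep == NULL)
   {
      copy_trimmed(name, namesize, line, strlen(line));
      source[0] = '\0';
      return 0;
   }

   copy_trimmed(name, namesize, line, (size_t)(sep - line));
   copy_trimmed(source, sourcesize, sep + 3, strlen(sep + 3));
   return 1;
}
```

With this sketch, "Lysozyme - Gallus gallus" is split into the name "Lysozyme" and the source "Gallus gallus", whereas a title with no separator is copied whole into the name.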

White space and line breaks are ignored. Each chain should end with a *, but the routine will accept the last chain of an entry with no *. While the standard requires upper case text, this routine will handle lower case and, if upcase is set, convert it to upper case. Although the routine copes well with last chains not terminated with a *, a last chain ending with a / that is followed by a text line rather than a * will be identified as incomplete rather than truncated. If the DoInsert flag is set, - signs in the sequence will be read as part of the sequence, otherwise they will be skipped. This is an addition to the PIR standard.
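The chain-splitting behaviour can be sketched in isolation as below. This is a simplified illustration, not the library's implementation: split_raw_chains() and MAXCHAINLEN are hypothetical, the input is a string rather than a FILE pointer, buffers are fixed-size instead of allocated, and every non-blank character other than * (including punctuation) is kept as-is, in line with the "raw" reading described for this routine.

```c
#include <ctype.h>
#include <stddef.h>

#define MAXCHAINLEN 256   /* hypothetical fixed buffer size for the sketch */

/* Hypothetical helper (not part of Bioplib): split raw sequence text into
   chains. White space is ignored, '*' ends a chain, all other characters
   are stored unchanged apart from optional upcasing. A final chain with
   no terminating '*' is still accepted. Returns the number of chains.   */
int split_raw_chains(const char *text, char chains[][MAXCHAINLEN],
                     int maxchain, int upcase)
{
   int    nchain = 0;
   size_t pos    = 0;

   if(maxchain < 1)
      return 0;
   chains[0][0] = '\0';

   for(; *text != '\0'; text++)
   {
      unsigned char ch = (unsigned char)*text;

      if(isspace(ch))                   /* blanks and line breaks        */
         continue;

      if(ch == '*')                     /* end of the current chain      */
      {
         chains[nchain][pos] = '\0';
         nchain++;
         pos = 0;
         if(nchain >= maxchain)         /* no room for another chain     */
            return nchain;
         chains[nchain][0] = '\0';
         continue;
      }

      if(pos < MAXCHAINLEN - 1)         /* silently truncate long chains */
         chains[nchain][pos++] = (char)(upcase ? toupper(ch) : ch);
   }

   if(pos > 0)                          /* last chain had no '*'         */
   {
      chains[nchain][pos] = '\0';
      nchain++;
   }
   return nchain;
}
```

For example, under these assumptions the text "acd.ef gh*KL/M*np" with upcasing enabled gives three chains, "ACD.EFGH", "KL/M" and "NP", the last one being accepted despite the missing *.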

Text lines: Text lines after an entry (beginning with R;, C;, A;, N; or F;) are ignored.
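As a rough sketch of how such lines might be recognised, the hypothetical helper below (is_pir_text_line(), not a Bioplib function) tests for one of the listed letters followed by a semicolon at the start of a line.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of Bioplib): returns 1 if the line starts
   with R;, C;, A;, N; or F; and 0 otherwise.                            */
int is_pir_text_line(const char *line)
{
   return line != NULL &&
          line[0] != '\0' &&            /* strchr would match the NUL    */
          strchr("RCANF", line[0]) != NULL &&
          line[1] == ';';
}
```

A sequence line such as "ACDEF" is not mistaken for a text line because its second character is not a semicolon.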

  • 28.02.95 Original based on ReadPIR() By: ACRM
  • 13.03.95 chpos++ had got moved wrongly when adapting from ReadPIR(). Put it back fixing handling of text lines.
  • 26.07.95 Removed unused variables
  • 06.02.96 Remove any trailing spaces

Definition at line 169 of file ReadRawPIR.c.