Product of Cytogene CG14732
This piece is based on a protein of unknown function encoded in the Drosophila genome. The sequence of the Drosophila genome was reported in 2000 (Science Mar 24 2000: 2185-2195), but many of its protein-coding genes still have no known function.
If the function of the protein is unknown, how do we know that anything is encoded by this region? Protein-coding genes are characterized by "open reading frames". An open reading frame is a stretch of DNA that begins with a start signal, ends with a stop signal, and contains a continuously translatable sequence in between. Cytogene CG14732 is located on the right arm of Drosophila chromosome 3. The entire translatable 874-amino-acid sequence of the open reading frame appears in the box below. Since the function of the protein has not been identified, and its folding structure has not been analyzed, no features of the protein are indicated.
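The definition of an open reading frame given above can be sketched in a few lines of Python. This is only an illustration, not the method used to annotate the Drosophila genome: it assumes the usual ATG start codon and the three standard stop codons, scans a single strand, and ignores introns, which real gene prediction must handle. The function name find_orfs and the toy sequence are invented for this example; the toy codons were chosen to spell the first four residues of the protein (M, F, P, G) and are not taken from the actual gene.

```python
# Usual start codon and the three standard stop codons.
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end) index pairs of open reading frames on one strand.

    Each frame starts at an ATG and ends at the first in-frame stop codon.
    The end index is exclusive and includes the stop codon.
    min_codons is the minimum number of codons before the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):                      # three possible reading frames
        start = None
        for i in range(frame, len(dna) - 2, 3): # step codon by codon
            codon = dna[i:i + 3]
            if start is None and codon == START:
                start = i                       # open a candidate frame
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                    # close it and keep scanning
    return sorted(orfs)


# Toy sequence invented for illustration; it is NOT the CG14732 gene.
toy = "CCATGTTTCCAGGATAAGG"
for start, end in find_orfs(toy):
    print(start, end, toy[start:end])
```

A frame that reaches the end of the sequence without a stop codon is not reported, which matches the requirement that an open reading frame both begins and ends with a signal.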
Drosophila Cytogene CG14732
Translation of the Open Reading Frame
MFPGGNGNAD LGYNKHAHVH GLGQHGHGHN HNHNQFLAQP PPPPTHFSLP SGGGQGMVTP MVAAGLPLAM QGGVGIDWAQ LAQQWIHMRD ATPVPMPLAP PPPIISNLRE YHQQTLAMPV PSVVRAGFPQ LEEHGEADMD MDDENDRGHS TETPPPPAPL VTQSQWLAGT EVAGQIPDAS NGAAVTLGKQ PWTGWQMQTD KSGNSTAHIP SLLKLNVSNP NEPHQQLQLQ LQAHQQQQQQ HAPQPTHPHH PPAHQPHAHP HHLTPHTHLP PQHTQQTHAE SGSSGASSEI DANKRKMLPA WIREGLEKME REKQRQLERQ QSSMTTPDVE VNHVKQVSSK TLTTPGNLLN IANVASDSED SIDVPVEARG ELKVQQIGNE LISKDDDRSS SSSEEPESEL HGAIETGNRT VAEHVRMATV NDVNGKNYEE RLADLMLVVR RTLTEILLET TNEEIAAIAG ETLKAHRAKA SSAQVIRKSA LSSITGNLGL AAYGDSSSET EDDEDDEDER QAGAGKDAEK SAQLSAEELK ARIRRSKRSF EKVIDDIEDR VAKQELLDEQ TLLRHRKREL ERSVTGGGEH RRPAANPEPP SESASQATQE KHQQSNGKRL SRKERTTRFS DNKDGKQQSQ SFVQQVVATA VVPPPGSLSN SSPQKLKPVP NPATNLLQMP ESVATMLTAA DKALHKANKS SHKKSKRRHS SPSSSSSSSG SSSSSDSDDS SSTSSGSKTS GSRSKRSSRH GHRSHHSSSR SKYERRDRDR ERERARERER DRDRSHRQHR SSQLSGSHHQ KQRRHRESSH SGEDAGGSSS YHRSSRQSSS RSHDHGSNRS HDHGSSSSGR KRHRTRSRSK SRSTHHSSSK AYSSTSASHP RKRH
Notes on the Music:
This piece plays both the sequence of the protein and that of its encoding DNA. The DNA is played throughout by percussion instruments that beat out the DNA sequence in sets of three notes -- the triplet codons of the DNA. The protein is played by English horn, and is embellished periodically by human voices singing the protein sequence at a slower tempo.
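The pairing of three percussion notes with each amino acid can be sketched as follows. The notes do not say how the sequences were actually assigned to sounds, so every mapping here is an assumption made for illustration: the drum names, the alphabetical pitch assignment, the base MIDI pitch of 52, and the names codons and score are all hypothetical and do not describe the composer's settings. The toy DNA is likewise invented to encode M, F, P, G.

```python
# Hypothetical assignment of one drum sound per DNA base (an assumption).
DRUM_FOR_BASE = {"A": "kick", "C": "snare", "G": "tom", "T": "hihat"}

# Hypothetical pitch scheme: the 20 amino acids in alphabetical order of
# their one-letter codes, mapped to consecutive MIDI note numbers.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BASE_PITCH = 52


def codons(dna):
    """Split a coding DNA sequence into consecutive triplets."""
    usable = len(dna) - len(dna) % 3            # drop any incomplete codon
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def score(dna, protein):
    """Pair each amino acid pitch with the three drum hits of its codon."""
    events = []
    for codon, residue in zip(codons(dna), protein):
        pitch = BASE_PITCH + AMINO_ACIDS.index(residue)
        hits = tuple(DRUM_FOR_BASE[base] for base in codon)
        events.append((pitch, hits))
    return events


# Toy coding sequence invented for illustration; NOT the real gene.
for pitch, hits in score("ATGTTTCCAGGA", "MFPG"):
    print(pitch, hits)
```

Each tuple holds one melody note and the three percussion hits that would sound beneath it, which is one way to realize "sets of three notes" per codon.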
This music was composed using BankStep sequencing software from Algorithmic Arts, and was created specifically for a genetics student who did a research project on a Drosophila mystery protein.