Abstract: A novel clustering method is proposed to classify genes or genomes. The method represents genomic data naturally as binary indicator sequences, one for each nucleotide: adenine (A), cytosine (C), guanine (G), and thymine (T). The discrete Fourier transform is then applied to these indicator sequences to calculate the power spectrum of each nucleotide. Mathematical moments are calculated for each of these spectra to create a multidimensional vector in a Euclidean space for each gene or genome sequence. Thus, each gene or genome sequence is represented as a geometric point in the Euclidean space. Finally, pairwise Euclidean distances between these points (i.e. genome sequences) are calculated to cluster the gene or genome sequences. The method is applied to three data sets: the genomes of 34 coronavirus strains, 118 of the known strains of human rhinovirus (HRV), and 30 bacterial genomes. A distance matrix is computed for each data set, giving the distance from each point to every other point. We used the complete linkage clustering algorithm to build phylogenetic trees that indicate how the distances among these sequences correspond to the evolutionary relationships among them. This genome representation provides a powerful and efficient method to classify genomes and is much faster than the widely acknowledged multiple sequence alignment method.
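The pipeline described in the abstract (indicator sequences, Fourier power spectra, moments, Euclidean distances, complete linkage) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation. All function names are hypothetical. The abstract does not give the moment formula, so the sketch assumes the j-th moment is the mean of PS(k)^j over the nonzero frequencies; the normalization used in the paper may well differ. The naive O(N^2) transform is only practical for short sequences; for real genomes a fast Fourier transform such as `numpy.fft.fft` would be the usual choice.

```python
import cmath
import math

NUCLEOTIDES = "ACGT"

def indicator(seq, nucleotide):
    """Binary indicator sequence: 1 where seq has the nucleotide, else 0."""
    return [1 if base == nucleotide else 0 for base in seq]

def power_spectrum(u):
    """Power spectrum |U(k)|^2 of a sequence, via a naive O(N^2) DFT."""
    n = len(u)
    spectrum = []
    for k in range(n):
        coeff = sum(u[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        spectrum.append(abs(coeff) ** 2)
    return spectrum

def moment_vector(seq, orders=(1, 2, 3)):
    """Map a sequence to a point: assumed moments of the four nucleotide spectra."""
    vec = []
    for nuc in NUCLEOTIDES:
        # Drop k = 0, which only encodes the nucleotide count.
        ps = power_spectrum(indicator(seq, nuc))[1:]
        for j in orders:
            # ASSUMPTION: j-th moment = mean of PS(k)^j; the paper may normalize differently.
            vec.append(sum(p ** j for p in ps) / len(ps))
    return vec

def distance_matrix(seqs):
    """Pairwise Euclidean distances between the moment vectors."""
    vecs = [moment_vector(s) for s in seqs]
    return [[math.dist(a, b) for b in vecs] for a in vecs]

def complete_linkage(dist, n_clusters):
    """Agglomerative clustering; the distance between two clusters is the
    largest pairwise distance between their members."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters
```

For example, `complete_linkage(distance_matrix(["ACGTACGT", "ACGTACGT", "AAAAAAAT"]), 2)` groups the two identical toy sequences together and leaves the third on its own. Because every sequence is reduced to a fixed-length vector regardless of its length, no alignment step is needed, which is the source of the speed advantage the abstract claims over multiple sequence alignment.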


An Entity of Type : fabio:Abstract, within Data Space : covidontheweb.inria.fr associated with source document(s)

Attributes	Values
type	abstract
subject	Genomics; Cluster analysis algorithms; Genetic mapping; Moment (mathematics); Nucleobases; Pythagorean theorem
part of	A novel clustering method via nucleotide-based Fourier power spectrum analysis
is abstract of	A novel clustering method via nucleotide-based Fourier power spectrum analysis
