Digital Archives Initiative
Memorial University - Electronic Theses and Dissertations 4
Document Description
Title: A vector floating point processing unit design
Author: Chen, Shi, 1976-
Description: Thesis (M.Eng.)--Memorial University of Newfoundland, 2008. Engineering and Applied Science
Pagination: xii, 104 leaves : ill.
Subject: Field programmable gate arrays; Floating-point arithmetic; Vector processing (Computer science)
Degree Grantor: Memorial University of Newfoundland. Faculty of Engineering and Applied Science
Discipline: Engineering and Applied Science
Notes: Includes bibliographical references (leaves 101-104)
Abstract: The main contribution of this thesis is the successful development of a vector floating point processing unit for high-accuracy scientific computing. For these numerically intensive applications, vector processing offers simple and straightforward parallelism by executing mathematical operations on multiple data elements simultaneously. The simple control and datapath structures of vector processing enable the embedded computing system to attain high performance at low power. -- This vector floating point processing unit includes a vector register file, vector floating point arithmetic units, and vector memory units. The central module, a vector register file, is divided into twelve lanes. One lane contains 16 vector registers, each including 32x32-bit elements, and is connected to a floating point adder and a floating point multiplier. By modeling the multi-port register file using configurable block RAM on a Field Programmable Gate Array (FPGA) target, vector register files can efficiently obtain data from external memory and feed data to different arithmetic units simultaneously. Utilizing the quick carry-out path and embedded multiplier macro unit, the vector floating point arithmetic units can run at over 200 MHz. A flag register is used to indicate the calculation sequence for the specific computing model. Moreover, the embedded PowerPC processor can not only control the calculation flow easily but also support an embedded operating system, extending the range of applications. The prototype is implemented on Xilinx Virtex II Pro devices, and a peak performance of 4.530 GFLOPS at 188.768 MHz has been achieved. -- First, we present a brief introduction to floating point arithmetic operations, including addition, multiplication, and fused multiply-add. Second, the architecture of the vector processing unit and a detailed description of the vector function units are introduced. Moreover, for a specific computing application, the appropriate overlap execution scheme is discussed. In the end, the performance of each component is analyzed, and the time and area analysis of the whole system is provided.
Resource Type: Electronic thesis or dissertation
Format: Image/jpeg; Application/pdf
Source: Paper copy kept in the Centre for Newfoundland Studies, Memorial University Libraries
Local Identifier: a2542352
Rights: The author retains copyright ownership and moral rights in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.
Collection: Electronic Theses and Dissertations
Scanning Status: Completed
PDF File: 9.88 MB
CONTENTdm file name: 29766.cpd
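
The peak-performance figure quoted in the abstract can be cross-checked from the other numbers given there. The sketch below is not taken from the thesis. It assumes that the adder and the multiplier in each of the twelve lanes each complete one floating point operation per clock cycle, and it reads "32x32-bit elements" as 32 elements of 32 bits each. All constant and function names are hypothetical.

```python
# Figures quoted in the abstract.
LANES = 12                # the vector register file is divided into twelve lanes
CLOCK_MHZ = 188.768       # clock at which the peak figure was reported
REGS_PER_LANE = 16        # vector registers per lane

# Assumptions made for this sketch (not stated explicitly in the abstract).
FP_UNITS_PER_LANE = 2     # one adder + one multiplier, each one result per cycle
ELEMENTS_PER_REG = 32     # "32x32-bit" read as 32 elements ...
BITS_PER_ELEMENT = 32     # ... of 32 bits each


def peak_gflops(lanes: int, units_per_lane: int, clock_mhz: float) -> float:
    """Peak rate in GFLOPS if every arithmetic unit retires one op per cycle."""
    return lanes * units_per_lane * clock_mhz / 1000.0


def register_file_bytes(lanes: int, regs: int, elements: int, bits: int) -> int:
    """Total vector register storage in bytes across all lanes."""
    return lanes * regs * elements * bits // 8


if __name__ == "__main__":
    gflops = peak_gflops(LANES, FP_UNITS_PER_LANE, CLOCK_MHZ)
    size = register_file_bytes(LANES, REGS_PER_LANE, ELEMENTS_PER_REG, BITS_PER_ELEMENT)
    print(f"peak: {gflops:.3f} GFLOPS")           # 12 * 2 * 188.768 MHz
    print(f"register file: {size // 1024} KiB")   # across all twelve lanes
```

Under these assumptions the product 12 x 2 x 188.768 MHz comes to 4.530 GFLOPS, which agrees with the figure in the abstract, and the register file holds 24 KiB in total.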