56 - NHR PerfLab Seminar 2023-09-05: DGEMM on Integer Tensor Cores/ClipID:49059

Recording date: 2023-09-05

Course link

HPC4FAU / NHR@FAU

Lecturer

Dr. Georg Hager

Access

Open

Language

English

Institution

Zentrum für Nationales Hochleistungsrechnen Erlangen (NHR@FAU)

Producer

Zentrum für Nationales Hochleistungsrechnen Erlangen (NHR@FAU)

Speaker: Hiroyuki Ootomo, Tokyo Institute of Technology

Title: DGEMM on Integer Tensor Cores

Date and time: Tuesday, September 5, 2 p.m. – 3 p.m.

Abstract:

To meet the increasing demand for dense matrix-matrix multiplication from the deep learning community, numerous vendors are developing processors with specialized matrix multiplication units, such as NVIDIA Tensor Cores and Google TPUs. This hardware is designed to perform matrix multiplication efficiently at low precision, taking advantage of the fact that deep learning tolerates low-precision operations and that its computation relies heavily on matrix multiplication. For machine learning inference, fixed-point computation is commonplace: the input and output values and the model parameters are quantized. Thus, many processors are now equipped with fast integer matrix multiplication units. This talk introduces a double-precision-equivalent matrix multiplication using Int8 Tensor Cores and the Ozaki scheme, a scheme for high-precision matrix multiplication on lower-precision computing units.
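
The idea behind the abstract's last sentence can be illustrated with a small sketch. It is not the speaker's implementation: the function names `split_int8` and `ozaki_int8_matmul`, the slice width of 6 bits, and the default of 10 slices are all assumptions chosen for illustration, and NumPy's int32 matrix product merely stands in for an Int8 Tensor Core unit. Each row of A and each column of B is scaled by a power of two, cut into a sequence of small integer slices, and the exact integer products of the slices are recombined in double precision.

```python
import numpy as np

def split_int8(M, num_slices, bits, axis):
    """Split a float64 matrix into int8 slices with one exponent per row
    (axis=1) or per column (axis=0). Hypothetical helper for illustration."""
    maxabs = np.max(np.abs(M), axis=axis, keepdims=True)
    _, exp = np.frexp(maxabs)          # maxabs = m * 2**exp with 0.5 <= m < 1
    R = np.ldexp(M, -exp)              # exact scaling, now |R| < 1
    slices = []
    for _ in range(num_slices):
        R = R * 2.0**bits              # exact: multiplication by a power of two
        S = np.trunc(R)                # integer part, |S| <= 2**bits - 1
        slices.append(S.astype(np.int8))
        R = R - S                      # exact remainder, again |R| < 1
    return slices, exp

def ozaki_int8_matmul(A, B, num_slices=10, bits=6):
    """Approximate A @ B in float64 using only int8 inputs to the products."""
    A_sl, ea = split_int8(A, num_slices, bits, axis=1)   # ea has shape (m, 1)
    B_sl, eb = split_int8(B, num_slices, bits, axis=0)   # eb has shape (1, n)
    C = np.zeros((A.shape[0], B.shape[1]))
    for i, Sa in enumerate(A_sl):
        for j, Sb in enumerate(B_sl):
            # Integer product with int32 accumulation; exact as long as the
            # inner dimension is small enough not to overflow int32.
            P = Sa.astype(np.int32) @ Sb.astype(np.int32)
            # Slice i carries weight 2**(-bits * (i + 1)), likewise slice j.
            C += np.ldexp(P.astype(np.float64), -bits * (i + j + 2))
    return np.ldexp(C, ea + eb)        # undo the row and column scaling

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 16))
B = rng.standard_normal((16, 5))
C = ozaki_int8_matmul(A, B)
print(np.max(np.abs(C - A @ B)))
```

In this sketch every slice product is exact, so the only errors come from truncating the inputs after `num_slices * bits` bits and from the final double-precision accumulation. Using fewer slices trades accuracy for fewer integer matrix products.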

Short bio:

Hiroyuki Ootomo is a Ph.D. candidate at Tokyo Institute of Technology, studying under Dr. Rio Yokota. His research interests lie in high performance computing, especially mixed-precision computing using special hardware, randomized numerical linear algebra, and quantum circuit simulation. His current work focuses on fast and high-accuracy GEMM on NVIDIA Tensor Cores and its applications.

For a list of past and upcoming NHR PerfLab seminar events, see: https://hpc.fau.de/research/nhr-perflab-seminar-series/
