Time- and energy-efficient embedded AI for Tensor Processing Units
LAAS-CNRS
Warszawa, Mazowieckie

Internship · 15 December, 13:12

Job description

This internship will investigate the various factors that contribute to the power consumption of deep neural networks executed on multiple tensor processing units (TPUs). Its aim is to identify the interplay between three factors: inference time, power consumption, and accuracy. The host laboratory is LAAS-CNRS in Toulouse, France.

Qualifications: good C and Python programming skills, knowledge of Linux and embedded systems, and basic knowledge of machine learning.
Duration: 3 to 6 months; the ideal starting period is February/March 2024, with some flexibility.
Salary: 650 euros/month.
Please see https://www.laas.fr/ost/node/351 for more details.

Responsibilities

  • Testbed for multi-TPU power consumption measurement. The instantaneous power consumption of different neural networks running on the ASUS CRL-G18U-P3DF multi-TPU board can be measured using current sensors and a microcontroller. A microcontroller (e.g., Arduino Uno or STM32) can collect the measurements in real time from the current sensors (e.g., ACS712 series) connected to the TPU board's PCI interface and send them to the host computer. Several frameworks with readily available firmware can be built upon (e.g., PowerSensor2, https://gitlab.com/astron-misc/PowerSensor/); a minimal host-side logging sketch is given below the list.
  • Software optimization techniques and power consumption. Various software optimization methods [1, 5], such as pruning, quantization, or weight sharing, can reduce deep neural networks’ computational requirements and memory footprint. By applying these optimization techniques and measuring power consumption during execution on multi-TPUs, a trade-off between power savings and performance degradation in terms of time and accuracy can be established; an example of post-training quantization is given below the list. This part of the project will be undertaken in collaboration with the Laboratoire d’InfoRmatique en Image et Systèmes d’information (LIRIS-CNRS) in Lyon, and the work on structured pruning (i.e., removing less useful neurons or feature maps according to a criterion) is the objective of another internship (https://liris.cnrs.fr/sites/default/files/emploi/sujet_stage_m2_ia3f.pdf).
  • TPU pipeline design. Model pipelining makes it possible to execute different segments of the same model on different TPUs to reduce the inference time (https://coral.ai/docs/edgetpu/pipeline/). To create and test the different model segmentations across multiple TPUs, a tool to modify and analyze tflite neural net model files (e.g., Netron) can be used; a sketch of running two segments on two TPUs is given below the list.
  • Benchmarking suite for TPUs. The final step is to create scripts that automate the tests of compressed neural networks with different segmentations and different numbers of TPUs in use, and to interpret the benchmarking results (energy consumption, inference time, accuracy). A skeleton of such a driver is given below the list.
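
A possible host-side counterpart of such a testbed is a short script that reads the current samples streamed by the microcontroller and converts them to power. The sketch below is only an illustration and not part of the posting: the serial port name, baud rate, one-current-value-per-line protocol, and the assumed 12 V supply rail would all have to match the actual firmware (e.g., PowerSensor2).

    # Minimal host-side power logger (sketch).
    # Assumptions: the firmware prints one current value in amperes per line,
    # the measured rail is 12 V, and the microcontroller enumerates as
    # /dev/ttyACM0 at 115200 baud.
    import csv
    import time

    import serial  # pyserial

    PORT = "/dev/ttyACM0"   # hypothetical port name
    BAUD = 115200           # hypothetical baud rate
    SUPPLY_VOLTAGE = 12.0   # assumed supply voltage in volts

    def log_power(duration_s: float, out_file: str = "power_log.csv") -> None:
        """Record (time, current, power) samples for duration_s seconds."""
        with serial.Serial(PORT, BAUD, timeout=1) as link, \
                open(out_file, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["t_s", "current_a", "power_w"])
            t0 = time.time()
            while time.time() - t0 < duration_s:
                line = link.readline().decode(errors="ignore").strip()
                if not line:
                    continue
                try:
                    current_a = float(line)
                except ValueError:
                    continue  # skip malformed samples
                writer.writerow([time.time() - t0, current_a,
                                 current_a * SUPPLY_VOLTAGE])

    if __name__ == "__main__":
        log_power(duration_s=10.0)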
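
As a concrete example of one of these optimization techniques, the sketch below applies post-training full-integer quantization with the TensorFlow Lite converter. The model and calibration images are placeholders, and the resulting .tflite file would still have to go through the Edge TPU compiler before running on the TPUs; pruning and weight sharing would follow a similar convert-then-measure workflow.

    # Post-training full-integer quantization (sketch); "model" and
    # "calibration_images" are placeholders supplied by the caller.
    import numpy as np
    import tensorflow as tf

    def quantize(model: tf.keras.Model, calibration_images: np.ndarray) -> bytes:
        """Convert a Keras model to a fully int8-quantized TFLite flatbuffer."""
        def representative_dataset():
            # A few hundred samples are typically enough for calibration.
            for sample in calibration_images[:200]:
                yield [sample[np.newaxis, ...].astype(np.float32)]

        converter = tf.lite.TFLiteConverter.from_keras_model(model)
        converter.optimizations = [tf.lite.Optimize.DEFAULT]
        converter.representative_dataset = representative_dataset
        converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
        converter.inference_input_type = tf.uint8
        converter.inference_output_type = tf.uint8
        return converter.convert()

    # Usage: write the flatbuffer to disk, then compile it for the Edge TPU,
    # e.g. with `edgetpu_compiler model_int8.tflite`.
    # open("model_int8.tflite", "wb").write(quantize(my_model, my_images))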
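
For the pipelining step itself, the pycoral library ships a PipelinedModelRunner that overlaps segment execution (see the Coral link above). As a first, simpler check of a candidate segmentation, the compiled segments can also be chained by hand, as in the sketch below; the segment file names and the ':0'/':1' Edge TPU device strings are assumptions, and this sequential version verifies correctness and per-segment latency rather than pipelined throughput.

    # Running two compiled model segments on two Edge TPUs (sketch).
    # Segment file names and device strings are hypothetical.
    import numpy as np
    import tflite_runtime.interpreter as tflite

    def make_tpu_interpreter(model_path: str, device: str) -> tflite.Interpreter:
        """Load a *_edgetpu.tflite segment onto one specific Edge TPU."""
        delegate = tflite.load_delegate("libedgetpu.so.1", {"device": device})
        interpreter = tflite.Interpreter(model_path=model_path,
                                         experimental_delegates=[delegate])
        interpreter.allocate_tensors()
        return interpreter

    seg0 = make_tpu_interpreter("model_segment0_edgetpu.tflite", ":0")
    seg1 = make_tpu_interpreter("model_segment1_edgetpu.tflite", ":1")

    def run_segmented(input_tensor: np.ndarray) -> np.ndarray:
        """Push one input through both segments and return the final output."""
        seg0.set_tensor(seg0.get_input_details()[0]["index"], input_tensor)
        seg0.invoke()
        intermediate = seg0.get_tensor(seg0.get_output_details()[0]["index"])
        seg1.set_tensor(seg1.get_input_details()[0]["index"], intermediate)
        seg1.invoke()
        return seg1.get_tensor(seg1.get_output_details()[0]["index"])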
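
The benchmark automation can then be a thin driver that sweeps model variants and TPU counts, times each configuration, and writes one CSV row per configuration; energy figures would come from aligning these timestamps with the power log produced by the testbed above. Everything in the sketch below (variant names, TPU counts, the run_inference placeholder) is hypothetical glue to be replaced by the code from the previous steps.

    # Benchmark driver skeleton: mean latency per (model variant, TPU count).
    import csv
    import itertools
    import time

    MODEL_VARIANTS = ["baseline", "pruned", "quantized"]  # hypothetical names
    TPU_COUNTS = [1, 2, 4, 8]
    RUNS_PER_CONFIG = 100

    def run_inference(model_variant: str, num_tpus: int) -> None:
        """Placeholder for one inference; replace with the real segmented run."""
        time.sleep(0.001)  # stand-in workload so the driver can be dry-run

    def benchmark(out_file: str = "benchmark.csv") -> None:
        with open(out_file, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["model", "num_tpus", "mean_latency_ms"])
            for model, tpus in itertools.product(MODEL_VARIANTS, TPU_COUNTS):
                t0 = time.perf_counter()
                for _ in range(RUNS_PER_CONFIG):
                    run_inference(model, tpus)
                mean_ms = (time.perf_counter() - t0) / RUNS_PER_CONFIG * 1e3
                writer.writerow([model, tpus, round(mean_ms, 3)])

    if __name__ == "__main__":
        benchmark()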

Requirements

  • Good C and Python programming skills, knowledge of Linux and embedded systems, and basic knowledge of machine learning.

We offer

  • 650 euros/month

Type of offer

Internship

Working time

Full-time

Form of employment

Internship

For whom

Student

Education

  • Electronics and telecommunications
  • Computer science

Languages

  • English (advanced)

Salary

2800 PLN

Required documents

Cover letter
Curriculum vitae

How to apply

VISIBLE AFTER LOGGING IN