Dec 31, 2023

It seems to me that the division-of-labour concept under investigation here could extend beyond efficient flash/DRAM utilization. It could allow entire parts of the network to live in the cloud, while time-critical elements stay on-device. Information going over the network would be limited to tensor data, which would be meaningful only to the rest of the on-device model, so raw user input would never leave the device. This could allow for a hugely scalable approach to running LLMs on mobile devices.
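To make the idea concrete, here is a minimal sketch of what such a split might look like. Everything in it is an assumption for illustration: the three toy layers, their weights, the names `DEVICE_HEAD`, `CLOUD_BODY`, `DEVICE_TAIL`, and `cloud_endpoint`, and the byte format on the wire. The "network" is just a function call. It only shows that intermediate tensors, not the raw input, are what cross the boundary; it does not demonstrate any privacy guarantee.

```python
import struct

def dense(weights, bias, x):
    """One toy fully connected layer with ReLU: y = max(0, W x + b)."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

def pack_tensor(values):
    """Serialize a flat tensor: 4-byte little-endian length, then float64s."""
    return struct.pack(f"<I{len(values)}d", len(values), *values)

def unpack_tensor(payload):
    """Inverse of pack_tensor."""
    (n,) = struct.unpack_from("<I", payload, 0)
    return list(struct.unpack_from(f"<{n}d", payload, 4))

# Hypothetical weights; a real model would load these from storage.
DEVICE_HEAD = ([[0.5, -1.0], [1.5, 2.0], [-0.25, 0.75]], [0.1, 0.0, -0.2])  # 2 -> 3
CLOUD_BODY = ([[1.0, 0.5, -1.0], [0.0, 2.0, 1.0]], [0.0, 0.5])              # 3 -> 2
DEVICE_TAIL = ([[1.0, -1.0]], [0.25])                                       # 2 -> 1

def cloud_endpoint(payload: bytes) -> bytes:
    """Stand-in for a remote service: it only ever sees activation bytes."""
    hidden = unpack_tensor(payload)
    return pack_tensor(dense(*CLOUD_BODY, hidden))

def run_split(user_input):
    """First and last layers run on-device; the middle runs 'in the cloud'."""
    hidden = dense(*DEVICE_HEAD, user_input)       # raw input stays local
    reply = cloud_endpoint(pack_tensor(hidden))    # only tensors are sent
    return dense(*DEVICE_TAIL, unpack_tensor(reply))

def run_local(user_input):
    """Reference: the same three layers run end to end in one place."""
    x = user_input
    for layer in (DEVICE_HEAD, CLOUD_BODY, DEVICE_TAIL):
        x = dense(*layer, x)
    return x
```

Calling `run_split([1.0, 2.0])` should give the same result as `run_local([1.0, 2.0])`, since the float64 values survive the round trip through bytes unchanged. In practice the cost of each round trip would decide which layers are worth moving off the device.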

Written by James B Maxwell
