AI Research Weekly: 7th March 2023

Google Research presents Performer-MPC, a mobile robot navigation model built on Performer architectures.

It builds on recent research that used transformers to encode robotic policies. Standard attention has quadratic time and space complexity in the sequence length, which was a hurdle for on-robot deployment because robots have strict latency requirements. This was solved by developing scalable low-rank implicit-attention Transformers with linear space and time complexity, called Performer architectures. They delivered 8 ms on-robot latency and helped the robot navigate tight spaces while demonstrating complex, socially acceptable behaviours.

Performer-MPC: Navigation via real-time, on-robot transformers
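
To make the quadratic-versus-linear point in the Performer item concrete, here is a minimal NumPy sketch. It assumes the commonly described Performer idea of approximating softmax attention with positive random features, and it leaves out refinements such as orthogonal random features. The function names, sizes, and synthetic inputs are illustrative assumptions, not taken from the Performer-MPC work.

```python
import numpy as np

def softmax_attention(q, k, v):
    """Standard attention: builds the full (n, n) weight matrix."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def positive_random_features(x, w):
    """Map x (n, d) to positive features (n, m) whose dot products
    approximate exp(x_i . x_j) in expectation over w ~ N(0, I)."""
    m = w.shape[0]
    half_sq_norm = 0.5 * np.sum(x ** 2, axis=-1, keepdims=True)
    return np.exp(x @ w.T - half_sq_norm) / np.sqrt(m)

def linear_attention(q, k, v, w):
    """Approximate softmax attention without forming an (n, n) matrix."""
    scale = q.shape[-1] ** 0.25            # splits the 1/sqrt(d) factor between q and k
    q_feat = positive_random_features(q / scale, w)   # (n, m)
    k_feat = positive_random_features(k / scale, w)   # (n, m)
    kv = k_feat.T @ v                      # (m, d_v), cost linear in n
    normalizer = q_feat @ k_feat.sum(axis=0)          # (n,)
    return (q_feat @ kv) / normalizer[:, None]

# Synthetic inputs (assumed sizes, chosen only for illustration).
rng = np.random.default_rng(0)
n, d, m = 16, 8, 4096
q = 0.3 * rng.standard_normal((n, d))
k = 0.3 * rng.standard_normal((n, d))
v = rng.standard_normal((n, d))
w = rng.standard_normal((m, d))            # random projection directions

exact = softmax_attention(q, k, v)
approx = linear_attention(q, k, v, w)
max_abs_error = np.max(np.abs(exact - approx))
```

The key design choice is the order of multiplication: computing `k_feat.T @ v` first keeps every intermediate at size (m, d) or (n, m), so memory and time grow linearly with the sequence length n rather than quadratically.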

Amazon Science presents SLIce, a technique for detecting robot clicks on ads on e-commerce websites in real time.

This builds on the earlier SLIDR research, a real-time deep neural network model trained with weak supervision to identify invalid clicks on online ads, deployed at Amazon since 2021. Here, a convex optimization technique is used to achieve optimal SLIDR performance on individual traffic slices within a budget of false positives.

SLIDR Convex Optimization: System Architecture
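
As a rough illustration of tuning per slice under a false-positive budget, the toy sketch below picks one score threshold per traffic slice to maximise the number of robot clicks caught while keeping total false positives within the budget. It uses brute-force search over a small grid, not the convex optimization described in the SLIDR work, and the idea that the tuned quantity is a per-slice threshold is an assumption. The slice names, scores, labels, and budget are all synthetic.

```python
from itertools import product

# Synthetic (score, is_robot) pairs per traffic slice; hypothetical data.
slices = {
    "slice_a": [(0.9, 1), (0.8, 0), (0.6, 1), (0.4, 0)],
    "slice_b": [(0.95, 1), (0.7, 1), (0.5, 0), (0.3, 1)],
}
candidate_thresholds = [0.25, 0.45, 0.65, 0.85]
fp_budget = 2  # maximum number of human clicks we may wrongly flag

def counts(clicks, threshold):
    """Return (true positives, false positives) when flagging score >= threshold."""
    tp = sum(1 for score, is_robot in clicks if score >= threshold and is_robot == 1)
    fp = sum(1 for score, is_robot in clicks if score >= threshold and is_robot == 0)
    return tp, fp

def best_thresholds(slices, candidates, fp_budget):
    """Try every per-slice threshold combination; keep the best one within budget."""
    names = list(slices)
    best = None
    for combo in product(candidates, repeat=len(names)):
        total_tp = total_fp = 0
        for name, threshold in zip(names, combo):
            tp, fp = counts(slices[name], threshold)
            total_tp += tp
            total_fp += fp
        if total_fp <= fp_budget and (best is None or total_tp > best[0]):
            best = (total_tp, total_fp, dict(zip(names, combo)))
    return best

best_tp, best_fp, chosen = best_thresholds(slices, candidate_thresholds, fp_budget)
```

Brute force scales exponentially with the number of slices, which is one reason a structured optimization formulation is attractive for a real system with many slices.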

Anthropic presents 3 studies highlighting RLHF techniques that help LLMs morally self-correct.

The studies tested the effect of natural language instructions on two related but distinct moral phenomena: stereotyping and discrimination. Stereotyping is measured with two well-known benchmarks, BBQ and Winogender. A new benchmark is constructed for measuring discrimination, testing the impact of race on law school admission. The studies focussed on decoder-only transformer models fine-tuned with Reinforcement Learning from Human Feedback (RLHF) to function as helpful dialogue models. They analysed the impact of scale, measured in terms of both model size (810M, 1.6B, 3.5B, 6.4B, 13B, 22B, 52B, and 175B parameters) and amount of RLHF training (50 and 100–1000 steps in increments of 100), within the same RLHF training run for each model size.
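
The scale sweep in the Anthropic item can be enumerated directly from the numbers quoted above. This short sketch only counts the (model size, RLHF steps) combinations; the variable names are illustrative and nothing here reflects the study's actual code.

```python
# Model sizes and RLHF step counts as listed in the summary above.
model_sizes = ["810M", "1.6B", "3.5B", "6.4B", "13B", "22B", "52B", "175B"]
rlhf_steps = [50] + list(range(100, 1001, 100))  # 50, then 100..1000 in steps of 100

# Every (model size, step count) pair evaluated in the sweep.
grid = [(size, steps) for size in model_sizes for steps in rlhf_steps]

num_checkpoints_per_model = len(rlhf_steps)
num_configurations = len(grid)
```

With 8 model sizes and 11 step counts each, the sweep covers 88 configurations per experiment.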

Alibaba presents Composer, a 5B-parameter text-to-image diffusion model optimised for controllability.

Recent large-scale generative models offer only limited controllability through their inputs. Here, controllability is enhanced by a new approach with compositionality as its core idea. First, a diffusion model is trained to generate an image from several representative factors as input. Then, during inference, these representative factors act as composable elements, giving great design flexibility. Various levels of conditions are supported, such as text description, depth map, sketch map, and color histogram.
