Inter Query and Intra Query Parallelism

Disentangling Inter- and Intra-Video Relations for Multi-Event Video-Text Retrieval and Grounding

Abstract: Video-text retrieval aims to precisely search for videos most relevant to text queries within a video corpus. However, existing methods are largely limited to single-text (single-event) ...

InfoWorld

Multi-token prediction technique triples LLM inference speed without auxiliary draft models

With reported 3x speed gains and limited degradation in output quality, the method targets one of the biggest pain points in production AI systems: latency at scale.

Researchers baked 3x inference speedups directly into LLM weights — without speculative decoding

Researchers from the University of Maryland, Lawrence Livermore, Columbia and TogetherAI have developed a training technique that triples LLM inference speed without auxiliary models or infrastructure ...

IEEE

An intra-string distributed and inter-string decentralized control method for hybrid series-parallel microgrids

Abstract: The hybrid series-parallel microgrid is attracting growing attention because it combines the advantages of series-stacked voltage and parallel-expanded capacity. Low-voltage distributed ...

GitHub

PhungTrinhUET/bao-postgresql-reproduction

PostgreSQL baseline (Bao OFF). Bao-enabled (Bao ON): PostgreSQL + pg_bao extension, Bao server for plan selection and reward logging, periodic retraining.

GitHub

[OSDI'25 Artifact] Achieving Low-Latency Graph-Based Vector Search via Aligning Best-First Search Algorithm with SSD

Welcome to the artifact repository of the OSDI'25 accepted paper: Achieving Low-Latency Graph-Based Vector Search via Aligning Best-First Search Algorithm with SSD! This repository contains the ...
