SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via CI

https://arxiv.org/static/browse/0.3.4/images/arxiv-logo-fb.png
LLM-powered agents are evaluated on SWE-CI, a repository-level benchmark that assesses their ability to maintain code quality over long-term evolution. SWE-CI comprises 100 tasks, each representing an average 233-day evolution history with 71 commits.

Cloud VM benchmarks 2026

https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fihxwrtvu3iezsrccaih2.png
A comprehensive cloud compute VM comparison was conducted, testing 44 instance types across 7 providers, including AWS, GCP, Azure, Oracle, Linode, DigitalOcean, and Hetzner, with a focus on CPU performance and price. The results show AMD EPYC Turin as the top performer, followed by Intel Granite Rapids and Google Axion, with significant performance differences between providers and instance types.

From RGB to L*a*b* color space (2024)

https://kaizoudou.com/wp-content/uploads/2024/07/image-2.png
To assess color accuracy between images, convert them to the Lab color space using the XYZ intermediate space and calculate Delta E (ΔE) for objective comparison. The Lab color space separates lightness (L*) from color information (a* and b*), making it ideal for precise color manipulation and comparison.
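The conversion chain described above (RGB to XYZ to Lab, then ΔE) can be sketched in a few lines of Python. This is a minimal illustration and not the article's own code. It assumes 8-bit sRGB input, the D65 reference white, and the simple CIE76 form of ΔE (Euclidean distance in Lab). The function names `srgb_to_lab` and `delta_e_cie76` are hypothetical.

```python
import math

# Assumption: D65 reference white, normalised so that Y = 1.0.
WHITE_D65 = (0.95047, 1.0, 1.08883)


def _linearize(channel_8bit):
    """Undo the sRGB transfer curve for one 8-bit channel (0..255)."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _f(t):
    """Piecewise cube-root function used by the CIE L*a*b* definition."""
    delta = 6.0 / 29.0
    return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3 * delta ** 2) + 4.0 / 29.0


def srgb_to_lab(r, g, b):
    """Convert an 8-bit sRGB triple to (L*, a*, b*) via the XYZ space."""
    rl, gl, bl = _linearize(r), _linearize(g), _linearize(b)
    # Linear sRGB -> XYZ using the standard sRGB/D65 matrix.
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl
    fx, fy, fz = (_f(v / w) for v, w in zip((x, y, z), WHITE_D65))
    return (116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz))


def delta_e_cie76(lab1, lab2):
    """CIE76 colour difference: Euclidean distance between two Lab triples."""
    return math.dist(lab1, lab2)
```

Comparing two images then amounts to converting corresponding pixels with `srgb_to_lab` and averaging `delta_e_cie76` over them. Later ΔE formulas such as CIEDE2000 weight the terms differently and are not shown here.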

"Warn about PyPy being unmaintained"

https://opengraph.githubassets.com/a7c3e32ce5c10a983fd362a0d6820b5b2ffddd24c0280dcca36daed2f90310be/astral-sh/uv/pull/17643
A commit was pushed to the tmeijn/dotfiles repository referencing a pull request that updated the astral-sh/uv package to version 0.9.27.

CasNum

https://repository-images.githubusercontent.com/1155292460/c3a16c6c-63b3-4762-9a3b-9c86b22e748b
CasNum is a library implementing arbitrary precision arithmetic using compass and straightedge constructions, integrating with a modified Game Boy emulator. It features a viewer showing geometric constructions and allows running games like Pokémon Red using only compass and straightedge operations.

MonoGame: A .NET framework for making cross-platform games

https://raw.githubusercontent.com/MonoGame/MonoGame.Logo/refs/heads/master/FullColorOnLight/LogoOnly_128px.png
MonoGame is a .NET framework for creating games across desktop, mobile, and console platforms using C#. It supports various platforms and has a growing list of features including Vulkan and DirectX12 graphics support.

How to run Qwen 3.5 locally

https://unsloth.ai/docs/~gitbook/image?url=https%3A%2F%2F3215535692-files.gitbook.io%2F~%2Ffiles%2Fv0%2Fb%2Fgitbook-x-prod.appspot.com%2Fo%2Fspaces%252FxhOjnexMCB3dmuQFQ2Zq%252Fuploads%252F7H0N7guLeBxQJzMTQeJ4%252FScreenshot%25202026-03-05%2520at%25203.59.09%25E2%2580%25AFAM.png%3Falt%3Dmedia%26token%3D3f3c6c7d-e249-409c-b95b-106e430205ee&width=768&dpr=3&quality=100&sign=7ca8413e&sv=2
Qwen3.5 is a new model family from Alibaba with various sizes and capabilities, including multimodal hybrid reasoning and support for 201 languages. It can be used for chat, coding, and long-context tasks.

Show HN: Curiosity – DIY 6" Newtonian Reflector Telescope

https://curiosity-telescope.vercel.app/Telescope/Design/04-Optical%20Layout%20cal.png
The image shown below is the most essential and the easiest calculation that one has to look for while building a Newtonian reflector telescope. FIG 1.1: Optical layout and the necessary calculation to make your life easy. Ref: How to build your own telescope by Richard Berry. In our case, for the distance between the diagonal mirror and the primary mirror, we did not strictly adhere to the ...

Emacs internals: Deconstructing Lisp_Object in C (Part 2)

The author discusses how they approach reading source code, starting from general computation and data representation, and applies this to the GNU Emacs source code, which uses a tagged pointer technique to represent Lisp values in C. This technique is a universal pattern in systems programming, allowing metadata to be stored in unused bits of pointers, and is used in Emacs to implement ...
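The tagged pointer idea can be illustrated with plain integers standing in for machine words. The sketch below is not Emacs code. It assumes addresses are aligned to 8 bytes, which leaves the low 3 bits free for a type tag. The tag values and the names `make_tagged`, `tag_of`, and `address_of` are hypothetical and chosen only for illustration.

```python
TAG_BITS = 3                      # assumption: 8-byte alignment frees 3 low bits
TAG_MASK = (1 << TAG_BITS) - 1    # 0b111

# Hypothetical tag values, not the ones Emacs actually assigns.
TAG_SYMBOL = 0
TAG_STRING = 1
TAG_CONS = 3


def make_tagged(address, tag):
    """Pack a type tag into the unused low bits of an aligned address."""
    if address & TAG_MASK:
        raise ValueError("address is not aligned; low bits are in use")
    if not 0 <= tag <= TAG_MASK:
        raise ValueError("tag does not fit in the available bits")
    return address | tag


def tag_of(word):
    """Read the type tag back out of a tagged word."""
    return word & TAG_MASK


def address_of(word):
    """Clear the tag bits to recover the original address."""
    return word & ~TAG_MASK
```

A single word therefore answers both "what kind of value is this?" and "where does it live?" without a separate header lookup, which is the property the article attributes to the technique.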

A decade of Docker containers


Dumping Lego NXT firmware off of an existing brick (2025)

The user contributed to the Pybricks project and obtained a used Lego NXT with the original 2006 firmware version 1.01, which they wanted to archive, leading to the discovery of arbitrary code execution. They successfully exploited the NXT's firmware to gain native ARM code execution, allowing them to access and dump the firmware, and potentially enabling the creation of an NXT worm.

Yoghurt delivery women combatting loneliness in Japan

https://ichef.bbci.co.uk/images/ic/480xn/p0n451bl.jpg.webp
In Japan, a network of women delivering probiotic milk drinks, known as Yakult Ladies, has become a vital source of connection and care for the elderly. These women, who are often self-employed, offer a lifeline of human connection and help reduce loneliness in a rapidly ageing population.

A Grand Vision for Rust

https://blog.yoshuawuyts.com/twitter-card.png
The author wants Rust to become the safest production-grade language by introducing new abstractions and type systems, such as ordered types and refinement types. They aim to improve the language's borrow checker story and memory safety guarantees through features like pattern types and view types.

Show HN: A weird thing that detects your pulse from the browser video

https://pulsefeedback.io/og-image.png
This page responds to your pulse through your camera. No one can see you. Only your heart rate is shared.

Autoresearch: Agents researching on single-GPU nanochat training automatically

https://raw.githubusercontent.com/karpathy/autoresearch/master/progress.png
A researcher, @karpathy, created a project to let AI agents experiment autonomously with a simplified LLM training setup, modifying code and training for 5 minutes at a time. The project uses a single file, train.py, which the agent edits, and a program.md file that humans edit to set up the research org.

The surprising whimsy of the Time Zone Database

https://muddy.jprs.me/media/20260306-203048.png
The author learned to handle time zones by using the IANA Time Zone Database, a resource built by others, rather than writing custom code. The database contains a rich history of time zone changes and whimsical comments.

Best performance of a C++ singleton

https://andreasfertig.com/img/sherlock.png
The author discusses implementing a singleton with performance in mind, comparing two approaches: a block-local static variable and a private static data member. The private static data member approach is recommended for better performance when a constructor is needed.

In 1985 Maxell built a bunch of life-size robots for its bad floppy ad

https://assets.buttondown.email/images/d74314eb-fcaf-42b2-9927-c01dbf847780.png?w=960&fit=max
Maxell's 1980s ads featured robots eating floppy disks, but the company also created life-size robot props that were displayed in a museum exhibit. The robots were part of a Smart Machines exhibit at The Computer Museum in Boston, which opened in 1987.

Digital Iris [video]

New Research Reassesses the Value of Agents.md Files for AI Coding

https://imgopt.infoq.com/fit-in/100x100/filters:quality(80)/articles/read-copy-update/en/smallimage/read-copy-update-thumbnail-1772618412119.jpg
Despite widespread industry recommendations, a new ETH Zurich paper concludes that AGENTS.md files may often hinder AI coding agents. The researchers recommend omitting LLM-generated context files entirely and limiting human-written instructions to non-inferable details, such as highly specific tooling or custom build commands. The team (Thibaud Gloaguen, Niels Mündler, Mark Müller, Veselin ...

Ten years of deploying to production

The author worked at a company in 2018 where the operations team was responsible for production deployments, which happened only every two weeks, causing delays in fixing issues. The author implemented a DevOps solution, creating an internal PyPI repository and establishing a pattern of versioning and code review, which improved the developer experience and reduced friction in production changes.

FLASH radiotherapy's bold approach to cancer treatment

https://spectrum.ieee.org/media-library/photo-of-a-man-in-a-lab-coat-adjusting-a-large-piece-of-medical-equipment-thats-pointed-at-the-head-of-a-partial-mannequin.jpg?id=65111419&width=1200&height=913
Physicists at CERN and other labs are developing FLASH radiotherapy, a new cancer treatment that delivers high doses of radiation in a short burst, reducing damage to healthy tissue. Researchers are refining the technology and expect it to become a routine clinical option in about 10 years, potentially transforming cancer care worldwide.

To the Polypropylene Makers

https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/HQTueNS4mLaGy3BBL/p7iaiua4zcd1zfeyrxqd
During the COVID-19 pandemic, Braskem America workers volunteered to live in their factories for 28 days to make polypropylene for N95 masks, producing 40 million pounds of it. They received full wages and a week off afterward, showing how creative thinking can fill vital gaps in emergencies.

macOS code injection for fun and no profit (2024)

The user discusses Live++ by Molecular Matters, a C/C++ hot-reload/live coding solution, and shares a project to inject code into a running process on macOS using Mach APIs. The project involves modifying a test program's memory, allocating executable memory, and setting up a trampoline to replace a function with new code.

Lisp-style C++ template meta programming

https://opengraph.githubassets.com/8deb14cd2cf854c003125cdb4d7bf1ed2c4a972054877790b2b7d24fe0d61621/mistivia/lmp
The code implements a prime number sieve using infinite integers and lazy evaluation. It generates a list of prime numbers starting from 2.
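The same idea (an unbounded stream of integers filtered lazily into primes) can be sketched with Python generators. This is not the repository's template code, only an analogue under stated assumptions: `itertools.count` stands in for the infinite integers, and the filter-based formulation below performs trial division on a lazy stream rather than a true array-based Sieve of Eratosthenes. The name `sieve` is hypothetical.

```python
from itertools import count, islice


def sieve(numbers):
    """Lazily yield primes from an increasing integer stream starting at 2.

    Each recursion level removes the multiples of one prime, so the
    recursion depth grows with the number of primes consumed.
    """
    prime = next(numbers)
    yield prime
    # Nothing below runs until the caller asks for the next prime.
    yield from sieve(n for n in numbers if n % prime != 0)


def first_primes(k):
    """Take the first k primes from the infinite stream."""
    return list(islice(sieve(count(2)), k))
```

Because every stage is a generator, only as many integers are examined as the caller actually requests. `first_primes(10)` touches numbers up to 29 and then stops.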

Files are the interface humans and agents interact with

https://avatars.githubusercontent.com/u/25641936?v=4
The author, a former vector database company employee, notes a shift in the AI ecosystem towards using filesystems for context and memory, citing various companies and researchers adopting this approach.

How important was the Battle of Hastings?


Compiling Prolog to Forth [pdf]

LLM Writing Tropes.md

A single file containing AI writing tropes was created to help AI assistants avoid common patterns in writing, such as overused adverbs, grandiose nouns, and false suspense transitions. The file lists various tropes to avoid, including negative parallelism, superficial analyses, and invented concept labels, to help AI writers produce more human-like and engaging content.

Re-creating the complex cuisine of prehistoric Europeans

https://cdn.arstechnica.net/wp-content/uploads/2026/03/cuisine2CROP-640x562.jpg
Archaeologists analyzed residues on prehistoric ceramic pots and found evidence of diverse diets combining plants and animals in ancient Eastern European populations. They discovered region-specific recipes, including fish with wild grasses and legumes in one area and fish with green vegetables in another.