SCI · Crystallography · SciCode · Correctness Flip
SCI-P07
mode: reasoning + code (dual) · SciCode · Crystallography
The Task
X-ray Diffraction: Implement X-ray diffraction pattern computation for arbitrary crystal systems. Calculate the B-matrix for converting fractional to Cartesian coordinates, compute structure factors, and generate diffraction patterns. 9 implementation sub-steps.
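As a rough illustration of the structure-factor step named in the task, the sketch below evaluates F(hkl) = Σ f_j exp(2πi (h x_j + k y_j + l z_j)) over atoms given in fractional coordinates. The function name `structure_factor`, the unit scattering factors, and the two-atom body-centred cell are assumptions made for this example; they are not taken from the task's reference solution.

```python
import numpy as np

def structure_factor(hkl, frac_coords, f=None):
    """Sum f_j * exp(2*pi*i * (h*x_j + k*y_j + l*z_j)) over all atoms.

    hkl         : Miller indices (h, k, l)
    frac_coords : (N, 3) fractional atomic coordinates
    f           : per-atom scattering factors; defaults to 1 for every atom
                  (an assumption, real factors depend on element and angle)
    """
    hkl = np.asarray(hkl, dtype=float)
    frac_coords = np.asarray(frac_coords, dtype=float)
    if f is None:
        f = np.ones(len(frac_coords))
    phases = np.exp(2j * np.pi * (frac_coords @ hkl))
    return np.sum(f * phases)

# Hypothetical body-centred cell: atoms at the corner and the body centre.
bcc_atoms = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]]

F_100 = structure_factor([1, 0, 0], bcc_atoms)  # h+k+l odd: the two terms cancel
F_110 = structure_factor([1, 1, 0], bcc_atoms)  # h+k+l even: the two terms add
```

The phase term uses fractional coordinates directly, so in this sketch the lattice geometry would enter only through peak positions and any angle-dependent scattering factors.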
The Outputs
Claude Opus 4.6 with extended thinking at maximum effort. Blind evaluation.
Opus 4.6 · Raw (no injection)
RAW (no injection):
BUG — Incomplete B-matrix:
B[2,2] = 1/c
This simplified computation is valid only for orthogonal crystal systems (cubic, tetragonal, orthorhombic). For triclinic, monoclinic, trigonal, and hexagonal systems, where at least one angle between the axes is not 90°, the matrix produces wrong Cartesian coordinates.
The code runs without errors and produces diffraction patterns. For cubic, tetragonal, and orthorhombic crystals, the patterns are correct. For any non-orthogonal crystal system, the reciprocal-space vectors are wrong, so the peak positions and every structure factor calculated from them are wrong as well.
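The failure mode can be reproduced with a small check. The sketch below assumes the simplified matrix puts 1/a, 1/b, 1/c on the diagonal with zeros elsewhere (the source shows only the `B[2,2]` line, so the other entries are an assumption) and compares its d-spacings with those from the direct metric tensor, which holds for any cell. The function names and the cell parameters are hypothetical.

```python
import numpy as np

def d_spacing_diagonal(hkl, a, b, c):
    # Assumed simplified B: 1/a, 1/b, 1/c on the diagonal, zeros elsewhere.
    B = np.diag([1.0 / a, 1.0 / b, 1.0 / c])
    return 1.0 / np.linalg.norm(B @ np.asarray(hkl, dtype=float))

def d_spacing_metric(hkl, a, b, c, alpha, beta, gamma):
    # General reference: 1/d^2 = h^T G^{-1} h, with G the direct metric tensor.
    al, be, ga = np.radians([alpha, beta, gamma])
    G = np.array([
        [a * a,              a * b * np.cos(ga), a * c * np.cos(be)],
        [a * b * np.cos(ga), b * b,              b * c * np.cos(al)],
        [a * c * np.cos(be), b * c * np.cos(al), c * c],
    ])
    h = np.asarray(hkl, dtype=float)
    inv_d_sq = h @ np.linalg.solve(G, h)
    return 1.0 / np.sqrt(inv_d_sq)

# Cubic cell: both routes agree.
d_cubic_diag = d_spacing_diagonal([1, 1, 0], 4.0, 4.0, 4.0)
d_cubic_ref = d_spacing_metric([1, 1, 0], 4.0, 4.0, 4.0, 90, 90, 90)

# Hexagonal cell (gamma = 120 degrees): the diagonal matrix overestimates d(100).
d_hex_diag = d_spacing_diagonal([1, 0, 0], 3.0, 3.0, 5.0)          # 3.0
d_hex_ref = d_spacing_metric([1, 0, 0], 3.0, 3.0, 5.0, 90, 90, 120)  # a*sqrt(3)/2, about 2.598
```

A shifted d-spacing moves the corresponding peak in 2θ, which is why the pattern looks plausible but is wrong for non-orthogonal cells.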
Additional issue: the code uses np.linalg.inv (explicit matrix inversion), which is generally less numerically stable than solving the linear system directly.
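The difference between the two calls can be sketched as follows. The example assumes the inversion is used to map a Cartesian reciprocal-space vector back to fractional indices, which the source does not state; the matrix `B` and vector `q` are made-up numbers for illustration.

```python
import numpy as np

# Hypothetical upper-triangular B-matrix and Cartesian vector q.
B = np.array([
    [0.20, 0.05,  0.01],
    [0.00, 0.18, -0.02],
    [0.00, 0.00,  0.14],
])
q = np.array([0.30, 0.10, 0.28])

# Explicit inverse: forms B^-1 first, then multiplies.
hkl_inv = np.linalg.inv(B) @ q

# Direct solve: factorises B and solves B @ hkl = q without forming B^-1.
hkl_solve = np.linalg.solve(B, q)
```

For a well-conditioned 3×3 matrix like this one the two results agree to rounding error; the direct solve is the safer default when the cell is strongly skewed and B becomes ill-conditioned.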
Opus 4.6 · Dual (reasoning + code)
DUAL (reasoning + code injection):
Full Busing-Levy B-matrix computation:
Computes the complete B-matrix with intermediate variables m and n that account for arbitrary triclinic lattice parameters (a, b, c, alpha, beta, gamma).
Handles all 7 crystal systems correctly: cubic, tetragonal, orthorhombic, hexagonal, trigonal, monoclinic, and triclinic.
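For reference, a minimal sketch of a general B-matrix in the Busing-Levy convention (without the 2π factor) is given below. It is not the Dual output itself: the intermediate variables m and n mentioned above are not shown in the source, so this version uses its own names, and `busing_levy_b` is a hypothetical function name. Angles are taken in degrees.

```python
import numpy as np

def busing_levy_b(a, b, c, alpha, beta, gamma):
    """B-matrix mapping (h, k, l) to Cartesian reciprocal-space coordinates."""
    al, be, ga = np.radians([alpha, beta, gamma])
    ca, cb, cg = np.cos(al), np.cos(be), np.cos(ga)
    sa, sb, sg = np.sin(al), np.sin(be), np.sin(ga)

    # Unit-cell volume for a general triclinic cell.
    vol = a * b * c * np.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)

    # Reciprocal lattice lengths and angles.
    a_star = b * c * sa / vol
    b_star = a * c * sb / vol
    c_star = a * b * sg / vol
    cos_be_star = (ca * cg - cb) / (sa * sg)
    cos_ga_star = (ca * cb - cg) / (sa * sb)
    sin_be_star = np.sqrt(1 - cos_be_star**2)
    sin_ga_star = np.sqrt(1 - cos_ga_star**2)

    return np.array([
        [a_star, b_star * cos_ga_star,  c_star * cos_be_star],
        [0.0,    b_star * sin_ga_star, -c_star * sin_be_star * ca],
        [0.0,    0.0,                   1.0 / c],
    ])

# Consistency check on a hypothetical triclinic cell: B^T B should equal the
# reciprocal metric tensor, i.e. the inverse of the direct metric tensor G.
cell = (5.0, 6.0, 7.0, 80.0, 95.0, 105.0)
B = busing_levy_b(*cell)
a, b, c = cell[:3]
al, be, ga = np.radians(cell[3:])
G = np.array([
    [a * a,              a * b * np.cos(ga), a * c * np.cos(be)],
    [a * b * np.cos(ga), b * b,              b * c * np.cos(al)],
    [a * c * np.cos(be), b * c * np.cos(al), c * c],
])
G_star = np.linalg.inv(G)
```

Note that the bottom-right entry is 1/c even in the general matrix, so what separates the general case from the orthogonal-only one is the off-diagonal terms and the use of reciprocal rather than direct lengths in the first two rows.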
Additional improvements:
- Uses np.linalg.solve (direct solve), which is numerically more stable than np.linalg.inv
- Blind evaluator margin: +8 points
The evaluator noted: "Solution A handles the special case. Solution B handles the general case. In crystallography, the general case is the real requirement — most interesting structures are not orthogonal."
Source: bbh_production/payloads.json. Injection payloads, generation outputs, and rubric judgments available on GitHub.