Sam Xu

Xiang Xu (Sam)

I am interested in multimodal agents and domain-specific expert models for design and manufacturing, with a focus on applications in CAD, CAE, CAM, and AEC. Previously, I was a Principal Research Scientist at Autodesk AI Lab, where I worked on vision-language models and SDF-based methods for CAD generation. I completed my PhD in CS at Simon Fraser University, supervised by Yasutaka Furukawa, and my BSc in ECE at Carnegie Mellon University, advised by Kris Kitani.

Education
  • PhD in CS, 2021 - 2024

    Simon Fraser University

  • MSc in CS, 2019 - 2021

    Simon Fraser University

  • BSc in ECE, 2014 - 2018

    Carnegie Mellon University

Experiences

Publications

See Google Scholar for the full list of publications.
DualBrep: A Dual-Field Continuous Representation for B-rep Modelling

DualBrep reformulates CAD models into a fully continuous domain by encoding geometry as a Shape field and topology as a Generalized Voronoi Diagram field. These dual fields are compressed into a single, unified latent representation, enabling robust downstream tasks such as conditional generation and deterministic reverse engineering. A learned rebuilder extracts explicit, watertight B-rep models directly from these continuous signals.

BRepAssembler: B-Rep Assembly Generation with Latent Edge Reasoning

Given an image or a complexity specification, BRepAssembler directly generates high-quality CAD assemblies as multi-body B-Reps. Using latent edge representations, our approach is more than 2x faster than previous autoregressive methods.

BRepFacetGen: Reverse Engineering B-Reps by Generative Face Segmentation

We adapt a pretrained SDF-based latent set model with a geometry-conditioned latent space to produce segmented meshes. The faceted mesh is then reconstructed into a watertight B-Rep.

B-Rep Distance Functions (BR-DF): How to Represent a B-Rep Model by Volumetric Distance Functions?

BR-DF is a geometric representation for Boundary Representation (B-Rep) models. A signed distance function (SDF) encodes surface geometry, while unsigned distance functions (UDFs) encode vertices, edges, faces, and their connectivity. An extension of the Marching Cubes algorithm converts BR-DF to a faceted B-Rep model.
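As a rough illustration of the two kinds of distance functions mentioned above, the following sketch evaluates an SDF for a solid and a UDF for a single edge. The shapes (an axis-aligned box and a straight segment) and the function names `sdf_box` and `udf_segment` are assumptions chosen for illustration only; they are not taken from BR-DF.

```python
import math

def sdf_box(p, half):
    """Signed distance from point p to an origin-centered, axis-aligned box.

    `half` holds the box half-extents. Negative inside, positive outside.
    """
    q = [abs(pi) - hi for pi, hi in zip(p, half)]
    outside = math.sqrt(sum(max(qi, 0.0) ** 2 for qi in q))
    inside = min(max(q), 0.0)
    return outside + inside

def udf_segment(p, a, b):
    """Unsigned distance from point p to the segment a-b (assumes a != b)."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    # Project p onto the segment's line and clamp to the segment's extent.
    t = sum(x * y for x, y in zip(ap, ab)) / sum(x * x for x in ab)
    t = min(1.0, max(0.0, t))
    closest = [ai + t * d for ai, d in zip(a, ab)]
    return math.dist(p, closest)

print(sdf_box((0.0, 0.0, 0.0), (1.0, 1.0, 1.0)))   # inside the box: negative
print(sdf_box((2.0, 0.0, 0.0), (1.0, 1.0, 1.0)))   # outside the box: positive
print(udf_segment((0.0, 1.0, 0.0), (-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)))
```

The sign of the SDF separates inside from outside, which suits closed surface geometry; a UDF has no sign, so it can describe open elements such as edges and vertices.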

AutoBrep: Autoregressive B-Rep Generation with Unified Topology and Geometry

AutoBrep is a unified autoregressive Transformer that progressively generates discrete B-Rep geometry and topology tokens, following a breadth-first traversal of the face adjacency graph.
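To make the traversal order concrete, here is a minimal sketch of a breadth-first ordering over a face adjacency graph. The dictionary-based graph, the cube example, and the name `bfs_face_order` are hypothetical; how AutoBrep picks the starting face, breaks ties, or tokenizes each face is not described here and is not modeled.

```python
from collections import deque

def bfs_face_order(adjacency, start):
    """Return face ids in breadth-first order.

    `adjacency` maps a face id to the ids of faces that share an edge with it.
    """
    order = []
    visited = {start}
    queue = deque([start])
    while queue:
        face = queue.popleft()
        order.append(face)
        # Sorting neighbors is an assumption made only for a deterministic order.
        for neighbor in sorted(adjacency[face]):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return order

# Hypothetical cube: face 0 = bottom, face 1 = top, faces 2-5 = sides.
cube = {
    0: [2, 3, 4, 5],
    1: [2, 3, 4, 5],
    2: [0, 1, 3, 5],
    3: [0, 1, 2, 4],
    4: [0, 1, 3, 5],
    5: [0, 1, 2, 4],
}
print(bfs_face_order(cube, 0))  # [0, 2, 3, 4, 5, 1]
```

Starting from the bottom face, the four side faces are visited before the top face, so adjacent faces tend to sit close together in the resulting sequence.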

HoLa: B-Rep Generation using a Holistic Latent Representation

A unified B-Rep variational autoencoder (VAE) encodes a B-Rep's topological and geometric information into a holistic latent space, and a latent diffusion model generates such latents from multiple modalities.

BrepGen: A B-rep Generative Diffusion Model with Structured Latent Geometry

A diffusion-based generative approach that directly outputs a CAD B-rep. We represent a B-rep as a novel structured latent geometry tree format. B-rep topology is implicitly represented by node duplication.

Hierarchical Neural Coding for Controllable CAD Model Generation

We represent high-level CAD design concepts as a hierarchical tree of neural codes. Users control the generation or auto-completion of CAD models by specifying the target design with a code tree.

SkexGen: Autoregressive Generation of CAD Construction Sequences with Disentangled Codebooks

SkexGen uses disentangled codebooks to generate diverse, high-quality CAD models, enhance user control, and enable efficient exploration of the CAD design space.

Structured Outdoor Architecture Reconstruction by Exploration and Classification

An explore-and-classify framework for architectural building reconstruction. Our method explores the structure space through heuristic modifications and classifies the correctness of the updated results.

D3D-HOI: Dynamic 3D Human-Object Interactions from Videos

Monocular video dataset with ground truth annotations of 3D object pose, shape and part motion. We leverage 3D human pose for more accurate inference of the object spatial layout and dynamics.

MCMI: Multi-Cycle Image Translation with Mutual Information Constraints

We treat single-cycle image translation models as modules that can be applied recurrently, with the process bounded by mutual information constraints between the input and output images.

Error Correction Maximization for Deep Image Hashing

We use the Hamming bound to derive the optimal criteria for learning hash codes with a deep network.
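For readers unfamiliar with the Hamming bound, the sketch below computes it for binary codes: a code of length n that corrects up to t bit errors can contain at most 2^n / sum_{k=0..t} C(n, k) codewords. The function name `hamming_bound` is an assumption for illustration, and the way the paper turns this bound into a training criterion is not reproduced here.

```python
from math import comb

def hamming_bound(n, t):
    """Upper bound on the number of binary codewords of length n
    when up to t bit errors must be correctable."""
    # Number of binary strings within Hamming distance t of one codeword.
    ball_volume = sum(comb(n, k) for k in range(t + 1))
    # Balls around distinct codewords must not overlap inside the 2**n strings.
    return 2**n // ball_volume

print(hamming_bound(7, 1))   # 16, attained by the Hamming(7,4) code
print(hamming_bound(23, 3))  # 4096, attained by the binary Golay code
```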