
ABOUT
Hi! I'm Supasorn, or Aek. I'm now a lecturer at VISTEC, a new research institute in Rayong, Thailand with a really nice, foresty campus. I'm looking for students, RAs/interns, postdocs, and all sorts of collaborators. Shoot me an email if you're interested. For the PhD program, please apply directly here.
News / Talks
- Two papers [1] [2] with one Oral @ CVPR 2024
- One paper [1] @ ICLR 2024!
- Three papers [1] [2] [3] @ ICCV 2023
- Two papers [1] [2] @ CVPR 2023
- One paper [1] @ ICLR 2023
- Area Chair for ICCV 2023
- Outstanding reviewer CVPR 2023
- Best Oral Presentation Award (AI Session) from PMU-B Brainpower Congress 2022.
- Our lab was awarded Google Research Gift!
- Our lab won ThaiSC HPC sponsorship!
- One Oral @ CVPR 2022 (Diffusion Autoencoders)
- Our lab won "AI for All" frontier research grant 2021! (Research and Innovation Thailand PMU)
- Our lab received an overseas Adobe Research Gift!
- Area Chair for CVPR 2022
- Two oral papers @ CVPR 2021; One best paper candidate!
- Our lab won Frontier Research Grant from Research and Innovation Thailand
- One paper accepted to ICDE 2021
- Outstanding reviewer awards: CVPR 2019, CVPR 2021
- KeypointNet Oral @ NeurIPS 2018, Montreal
- Keynote Speaker @ O'Reilly AI Conference 2018, London
- Digital Thailand Big Bang 2018, Bangkok
- TED 2018, Vancouver
Services
- Area Chair: ICCV 2025, CVPR 2025, ICCV 2023, CVPR 2022
- Reviewer: CVPR (outstanding reviewer x 3), ICCV, ECCV, SIGGRAPH + Asia, TPAMI, IJCV, TOG
- DEI Chair: ACCV 2024
- Program Chair: MLRS 2023
Bio
Before settling down in Rayong, I was a research resident at Google Brain working on geometric understanding in deep learning and image synthesis. I finished my Ph.D. at UW, working with Prof. Steve Seitz and Prof. Ira Kemelmacher in the Graphics-Vision group GRAIL. My goal is to bring computer vision out of the lab into the real world and make it really work in the wild. I went to Cornell for undergrad, where I had the great pleasure of working with Prof. John Hopcroft on social graph algorithms and was later inspired by Prof. Noah Snavely and his computer vision class. I love hacking, coding, and tackling hard problems, and I try very hard to make my solutions as simple as possible.
Contact
My first name at gmail. If I don't reply, feel free to ping me. It's likely I lost it in the pile of 30K+ mails.
RESEARCH
- DiffusionLight: Light Probes for Free by Painting a Chrome Ball
  P. Phongthawee, W. Chinchuthakun, N. Sinsunthithet, A. Raj, V. Jampani, P. Khungurn, S. Suwajanakorn
  CVPR 2024 (Oral)
  A novel lighting estimation technique that inpaints a chrome ball into the scene using a pre-trained diffusion model. We propose an iterative inpainting algorithm and fine-tune Stable Diffusion XL for exposure bracketing. Our method is the first to achieve good generalization to diverse, real-world images.
  [Paper] [Page] [Code]
- DNO: Optimizing Diffusion Noise Can Serve As Universal Motion Priors
  K. Karunratanakul, K. Preechakul, E. Aksan, T. Beeler, S. Suwajanakorn, S. Tang
  CVPR 2024
  A new method that effectively leverages pre-trained text-to-motion diffusion models to optimize diffusion noise for various motion-related tasks, without requiring retraining for each task. DNO allows for efficient motion editing and control, surpassing previous methods in achieving objectives while preserving motion content.
  [Paper] [Page] [Code]
- Diffusion Sampling with Momentum for Mitigating Divergence Artifacts
  S. Wizadwongsa, W. Chinchuthakun, P. Khungurn, A. Raj, S. Suwajanakorn
  ICLR 2024
  Our study reveals divergence and artifact issues when using high-order methods for low-step diffusion sampling. To tackle this, we introduce two approaches: integrating Heavy Ball momentum to enhance stability and a new method, Generalized Heavy Ball (GHVB), to balance accuracy and artifact suppression.
  [Paper] [Page] [Code]
- DiFaReli: Diffusion Face Relighting
  P. Ponglertnapakorn, N. Tritrong, S. Suwajanakorn
  ICCV 2023
  Our state-of-the-art face relighting technique solves highly challenging scenarios such as cast shadows, strong highlights, unusual makeup, and facial accessories. It only needs 2D images to train---no need for light stage data, multiview images, relit pairs, 3D or lighting ground truth!
  [Paper] [Page] [Code]
- GMD: Guided Motion Diffusion for Controllable Human Motion Synthesis
  K. Karunratanakul, K. Preechakul, E. Aksan, T. Beeler, S. Suwajanakorn, S. Tang
  ICCV 2023
  GMD solves conditional human motion generation based on text prompts, reference trajectories, key locations, etc., without retraining diffusion models for each of these tasks. Our contributions include an effective feature projection scheme, a new imputation formulation, and a dense guidance approach. GMD achieves significant improvements over state-of-the-art methods.
  [Paper] [Page] [Code]
- Zero-guidance Segmentation Using Zero Segment Labels
  P. Rewatbowornwong, N. Chatthee, E. Chuangsuwanich, S. Suwajanakorn
  ICCV 2023
  We propose a novel problem, zero-guidance segmentation: segmenting everything without "guiding" text queries or prompting. Our first solution to this problem leverages two pre-trained generalist models, DINO and CLIP, without any fine-tuning or segmentation datasets.
  [Paper] [Page] [Code]
- Learning Geometric-aware Properties in 2D Representation Using Lightweight CAD Models, or Zero Real 3D Pairs
  P. Arsomngern, S. Nutanong, S. Suwajanakorn
  CVPR 2023
  This study leverages CAD models to improve 2D scene understanding, overcoming the scalability issues of traditional 2D-3D datasets. By using geometrically aligned 3D spaces and pseudo pairs, it achieves comparable results in various tasks while using much less data.
  [Paper] [Page]
- StyleGAN Salon: Multi-View Latent Optimization for Pose-Invariant Hairstyle Transfer
  S. Khwanmuang, P. Phongthawee, P. Sangkloy, S. Suwajanakorn
  CVPR 2023
  A multi-view GAN optimization framework for virtual hair try-on. Our technique supports various highly challenging scenarios, such as transforming a long hairstyle with bangs to a pixie cut. Our method outperforms the state of the art according to the most comprehensive user study.
  [Paper] [Page]
- Accelerating Guided Diffusion Sampling with Splitting Numerical Methods
  S. Wizadwongsa, S. Suwajanakorn
  ICLR 2023
  This paper identifies the inefficacy of existing high-order numerical methods for guided sampling and proposes a solution using operator splitting methods. Our method has the same quality as 250-step DDIM while using 32-58% less sampling time on ImageNet256 and works across various tasks.
  [Paper] [Code]
- Toward Ant-Sized Moving Object Localization Using Deep Learning in FMCW Radar: A Pilot Study
  N. Kumchaiseemak, I. Chatnuntawech, S. Teerapittayanon, P. Kotchapansompote, T. Kaewlee, M. Piriyajitakonkij, T. Wilaiprasitporn, S. Suwajanakorn
  IEEE Geoscience and Remote Sensing 2022
  A deep learning-based approach to localizing a small moving object with a single millimeter-wave frequency-modulated continuous-wave (FMCW) radar.
  [Paper]
- Diffusion Autoencoders: Toward a Meaningful and Decodable Representation
  K. Preechakul, N. Chatthee, S. Wizadwongsa, S. Suwajanakorn
  CVPR 2022 (Oral)
  Our diffusion-based autoencoders can simultaneously infer a compact semantic code and a stochastic code from an input image. These codes are useful for downstream tasks and allow near-exact reconstruction. Diff-AE can solve tasks like real-image manipulation without GANs and their error-prone inversion, and it produces a more discriminative latent space than StyleGAN-W.
  [Paper] [Page] [Code]
- NeX: Real-time View Synthesis with Neural Basis Expansion
  S. Wizadwongsa, P. Phongthawee, J. Yenphraphai, S. Suwajanakorn
  CVPR 2021 (Oral) - Best paper candidate
  We present NeX, a new approach to novel view synthesis based on enhancements of multiplane image (MPI) that can reproduce NeXt-level view-dependent effects---in real time. A 1000x speedup over the state of the art.
  [Paper] [Page] [Code] [Data]
- Repurposing GANs for One-shot Semantic Part Segmentation
  N. Tritrong, P. Rewatbowornwong, S. Suwajanakorn
  CVPR 2021 (Oral) / TPAMI 2022
  We present a simple and powerful method that repurposes GANs for few-shot semantic part segmentation. Our approach achieves surprising and unprecedented performance and is competitive with fully-supervised baselines that require 10-50x more labeled examples.
  [Paper] [Extended-Paper] [Page]
- Self-Supervised Deep Metric Learning for Pointsets
  P. Arsomngern, C. Long, S. Suwajanakorn, S. Nutanong
  ICDE 2021 / TPAMI 2021
  We propose a self-supervised deep metric learning solution for pointsets, which enables effective representation learning on unlabeled datasets. Our key idea is the use of the Earth Mover's Distance (EMD) to generate pseudo labels. Our method produces results superior to other supervised competitors, including an NLP-specific approach based on BERT.
  [Paper] [Journal]
- Discovery of Latent 3D Keypoints via End-to-end Geometric Reasoning
  S. Suwajanakorn, N. Snavely, J. Tompson, M. Norouzi
  NeurIPS 2018 (Oral)
  We present KeypointNet, an end-to-end geometric reasoning framework to learn an optimal set of 3D keypoints, along with their detectors. Our model discovers semantically consistent keypoints across viewing angles and object instances and outperforms a fully supervised baseline on the task of pose estimation -- all without keypoint location supervision.
  [Paper] [Page] [Code]
- Synthesizing Obama: Learning Lip Sync from Audio
  S. Suwajanakorn, S.M. Seitz, I. Kemelmacher-Shlizerman
  SIGGRAPH 2017 / TED 2018
  Given audio of President Barack Obama, we synthesize photorealistic video of him speaking with accurate lip sync. Trained on many hours of just video footage from whitehouse.gov, our recurrent neural net approach synthesizes mouth shape and texture from audio, which are composited into a reference video.
  [Paper] [Page] [Code] [TedTalk] [Discussion]
- What Makes Tom Hanks Look Like Tom Hanks
  S. Suwajanakorn, S.M. Seitz, I. Kemelmacher-Shlizerman
  ICCV 2015 - Madrona Prize Winner / Innovation of the Year 2016
  We reconstruct a controllable model of a person from a large photo collection that captures his or her persona, i.e., physical appearance and behavior. Our system is based on a novel combination of 3D face reconstruction, tracking, alignment, and multi-texture modeling, applied to the puppeteering problem.
  [Paper] [Page]
- Depth from Focus with Your Mobile Phone
  S. Suwajanakorn, C. Hernández, S.M. Seitz
  CVPR 2015
  We introduce the first depth from focus (DfF) method capable of handling images from mobile phones and other hand-held cameras. With this technique, we can automatically generate a depth map for every photo you take with your phone.
  [Paper] [Supplement] [Video] [Data] [Patent]
- Total Moving Face Reconstruction
  S. Suwajanakorn, I. Kemelmacher-Shlizerman, S.M. Seitz
  ECCV 2014 - Madrona Prize Runner-Up
  Our approach takes a single video of a person's face and reconstructs a high-detail 3D shape for each video frame. We target videos taken under uncontrolled and uncalibrated imaging conditions.
  [Paper] [Page]
- Illumination-aware Age Progression
  I. Kemelmacher-Shlizerman, S. Suwajanakorn, S.M. Seitz
  CVPR 2014
  We present an approach that takes a single photograph of a child as input and automatically produces a series of age-progressed outputs between 1 and 80 years of age, accounting for pose, expression, and illumination.
  [Paper] [Page]
- Extracting the Core Structure of Social Networks Using (α, β)-Communities
  Liaoruo Wang, John Hopcroft, Jing He, Hongyu Liang, Supasorn Suwajanakorn
  Internet Mathematics, 2013
  We present a heuristic algorithm that in practice finds a fundamental community structure and demonstrate that the core structure in social networks is due to underlying social structure rather than high-degree vertices or degree distribution.
  [Paper]
- Detecting the Structure of Social Networks Using (α,β)-Communities
  Jing He, John Hopcroft, Hongyu Liang, Supasorn Suwajanakorn, Liaoruo Wang
  8th Workshop on Algorithms and Models for the Web Graph (WAW) 2011
  A talk I gave on my algorithm used in the paper (not in WAW).
  [Talk]
WORK EXPERIENCE
- AI Foundation
  Since 2017
  I'm serving as a member of the Global AI Council, AI Foundation, a San Francisco-based company focusing on applications and platforms enabled by personal AI.
- Google Research Intern
  Summer 2015
  I spent a wonderful summer working with Michael Rubenstein and Ce Liu, image/video/motion analysis experts, at Google Cambridge/MIT on motion-based 3D reconstruction, and received tremendous help and input from the team with Bill Freeman, Dilip Krishnan, and Inbar Mosseri.
- Google Software Engineering Intern
  Summer 2013
  I lingered around Seattle in 2013 and worked with Carlos Hernandez, a 3D-vision expert, and Steve Seitz, my academic advisor, at Google Seattle in Fremont on uncalibrated depth from focus, which eventually led to a paper published two years later. In 2010, during undergrad, I interned with Harish Venkataramani as a software engineering intern at Google Mountain View on a project related to Gmail and Google+.
SOURCE CODE
My work consists of multiple large components with tens of thousands of lines of code, so releasing it will take time and I may not be able to provide support. The "research-code" is neither polished nor properly commented, but I decided to release it now for research and educational purposes. No commercial use allowed.
Synthesizing Obama: Learning Lip Sync from Audio
Here's the network training code that takes as input processed MFCC coefficients and outputs mouth fiducial points represented as PCA coefficients.
Uncalibrated Photometric Stereo
Future release.
3D Optical Flow
Future release.
PERSONAL
My other interests include: photography, 3D printing, product design, graphic design, software dev, startups, sitting in a hammock, getting lost for fun. I play badminton and squash and sports that involve gliding and wheels such as skiing, snowboarding, skating. I flew off my bike a few times and enjoyed numerous wipe-outs from surfing, but still survived. I love hacking and building things and here are some of the things I built.
- Light Field Bot - 3D Photosphere
  Supasorn Suwajanakorn
  An automatic DSLR rig for capturing 2D/3D 360 photosphere and dense light field for VR. The rig is controlled with Wemos D1 Mini (ESP8266). It has a joystick and OLED screen for displaying menu. The structure is aluminum and 3D printed parts, designed in Fusion 360. Software and hardware are open-source.
  [Thingiverse] [Stitched OmniStereo 3D Pano]
- High-Res 360 Spherical Camera
  Supasorn Suwajanakorn
  A prototype camera that automatically captures a 360 x 180 photosphere similar to Google Street View with resolution up to 100 megapixels. It's a low-cost (~$70) version of gigapan made from a Raspberry Pi + custom 3D printed case. The camera rotates around the no-parallax point and is remotely controlled from a smartphone through a dedicated WiFi.
  [Github] [UW MakerSpace Tour] [My Boston APT]
- 2D Pattern to 3D Origami Popup
  Supasorn Suwajanakorn, Jonathan Hirschberg, Roopa Roa, Michael Tomaine
  Winner - The Faculty Award (Boom 2011) - Cornell University
  An application that lets you design a 2D pattern that can be automatically turned into a 3D foldable Origami popup model.
  [Video]
- Low-noise Stills from Video
  Supasorn Suwajanakorn, Andre Baixo
  Graphics Project - Graphics 2013 - UW
  A mobile app which combines multiple noisy shots taken hand-held into a single low-noise photo. What makes it special is that you can throw your tripod away and take pictures with your shaky hand and it will work just fine.
  [Report]
- Rovio & Juliet - Autonomous Indoor Navigation using SURF
  Jae Yong Sung, Supasorn Suwajanakorn, Jong Hwi Lee
  Best Project Award - Robot Learning 2011, Cornell University
  Our goal is to make Rovio a totally autonomous robot which can follow waypoints based only on images and reach the goal position while learning and avoiding obstacles on its way.
  [Video] [Report]
- "Mech Tournament"
  Supasorn Suwajanakorn, Natachai Laohachai, Poom Pechavanish
  Best Game Award - 7th Thailand National Software Contest 2005 & Asia Pacific ICT Merit Awards
  A 2D multiplayer shooting game based on DirectX 8 and DirectPlay. You get to choose your own robot and enter a battle against other players over a LAN or against computers with challenging AI.
- Classroom Controller
  Supasorn Suwajanakorn, Pochara Arayakarnkul
  Best Application Award - 6th Thailand National Software Contest 2004
  An application for controlling and monitoring a computer classroom. The teacher can inspect students' screens or any running applications, limit internet or application access, and issue online quizzes or polls on any machine in the classroom.
- Remote Fish Feeder
  Supasorn Suwajanakorn, Naiyarit Sanpol, Thanapol Tanprayoon
  Winner - National Science Project Contest 2003
  In grade 9, I built a remote fish feeding device that is connected to a home telephone line. The device can then be activated by calling home from anywhere and entering a secret passcode. Fish will thank you while you're sunbathing on a beach.
- Google Cardboard Positional Tracking Hack
  Supasorn Suwajanakorn
  I'm trying to add positional tracking to the Cardboard app (full 6DOF). Proper calibration is needed. Tracking is done on a computer and position values are sent through USB to the phone (TCP via ADB). Latency is not superb, but should be better than using the phone's camera and processing power.
  [Github] [YouTube]
- 3D Augmented-Reality Graphing Calculator
  Supasorn Suwajanakorn, Yu Cheng, Cooper Findley
  Showcase at Boom 2011 - Cornell University
  Our fun app that can plot a 3D graph on a piece of paper. During the contest, a number of kids had a lot of fun hand-drawing 2D patterns, e.g. hearts, their names, random art, and visualizing the 3D surface in real time.
- jitouch
  Supasorn Suwajanakorn, Sukolsak Sakshuwong
  Runner-Up - Ars Technica Design Award Student Mac App
  NYTimes / macworld
  An award-winning Mac app that expands the set of multi-touch gestures for the trackpad and Magic Mouse for frequent tasks such as changing tabs in a web browser or closing windows. It also recognizes handwritten alphabets as input gestures. Please check it out!
  [Website]
- Obox iPhone
  Supasorn Suwajanakorn
  A fast-paced puzzle game on iPhone that I made over one summer during an internship (not related). Unfortunately, it's no longer available, as I did not have time to update it for newer iOS versions.
  [Video]
- jiswitch
  Supasorn Suwajanakorn
  jiswitch is a free Mac application that introduces a new way to switch applications. In a nutshell, it allows users to assign any window a name, and later bring that window to the top whenever the same name is typed. This tool is meant for power-users or coders who have many windows opened.
  [Website]
Designs
- Boom 2010 Logo
  Supasorn Suwajanakorn
  Winner - Science Showcase Logo Contest - Cornell University
  A logo I designed for Cornell's "annual showcase of student research and creativity in digital technology and applications." I used Sunflow for global illumination rendering and Structure Synth to synthesize the model. -- Earned me an iPod touch
- Cornell Class of 2011 Logo
  Supasorn Suwajanakorn
  Winner - Cornell University
  A logo selected to represent Cornell class of 2011. -- Earned me an iPod nano
- Cornell Minds Matter Logo
  Supasorn Suwajanakorn
  Winner - Cornell University
  A logo selected to represent Cornell Minds Matter, an organization that promotes the overall mental and emotional health of all Cornell students. -- Earned me an iPod touch
- Thai Festival 2010
  Supasorn Suwajanakorn
  A poster for Cornell Thai Festival 2010. It features my hand-drawn rendition of an ancient Thai art form called "Lie Kranok", as seen above the text.
- Cornell Thai Association
  Supasorn Suwajanakorn
  A logo for Cornell Thai Association. It's the clock tower with a new twist on the roof.