 
 
  
  Apache Texera (Incubating) supports scalable data computation and enables advanced AI/ML techniques.
  
  "Collaboration" is a key focus, and we enable an experience similar to Google Docs, but for data science. 
  
  
- Provide data science as cloud services;
- Provide a browser-based GUI to form a workflow without writing code;
- Allow non-IT people to access data science;
- Support collaborative data science;
- Allow users to interact with the execution of a job;
- Support huge volumes of data efficiently.
The Texera interface supports real-time collaboration on data science projects, allowing seamless sharing of data and workflows with easy access to AI/ML techniques and efficient management of public and private resources.
The workflow in the use case shown below includes data cleaning, ML model training, and validation.

- (5/2025) Responsive Retrieval of Consistent States in Pipelined Executions of Dataflows
 Shengquan Ni and Chen Li
 To appear in HILDA Workshop at SIGMOD 2025
- (11/2024) IcedTea: Efficient and Responsive Time-Travel Debugging in Dataflow Systems
 Shengquan Ni, Yicong Huang, Zuozhi Wang, and Chen Li
 To appear in VLDB 2025
- (8/2024) Pasta: A Cost-Based Optimizer for Generating Pipelining Schedules for Dataflow DAGs
 Xiaozhen Liu, Yicong Huang, Xinyuan Lin, Avinash Kumar, Sadeem Alsudais, and Chen Li
 To appear in SIGMOD 2025
- (7/2024) Texera: A System for Collaborative and Interactive Data Analytics Using Workflows
 Zuozhi Wang, Yicong Huang, Shengquan Ni, Avinash Kumar, Sadeem Alsudais, Xiaozhen Liu, Xinyuan Lin, Yunyan Ding, and Chen Li
 In VLDB 2024, Scalable Data Science track | PDF | Slides
- (3/2024) Demonstration of Udon: Line-by-line Debugging of User-Defined Functions in Data Workflows
 Yicong Huang, Zuozhi Wang, and Chen Li
 In SIGMOD 2024, Best Demo Runner-Up Award 🏆 | PDF
- (2/2024) Data Science Tasks Implemented with Scripts versus GUI-Based Workflows: The Good, the Bad, and the Ugly
 Alexander K Taylor, Yicong Huang, Junheng Hao, Xinyuan Lin, Xiusi Chen, Wei Wang, and Chen Li
 In DataPlat Workshop at ICDE 2024 | PDF | Slides
- (8/2023) Building a Collaborative Data Analytics System: Opportunities and Challenges
 Zuozhi Wang and Chen Li
 In Tutorial at VLDB 2023 | PDF | Slides
- (8/2023) Udon: Efficient Debugging of User-Defined Functions in Big Data Systems with Line-by-Line Control
Yicong Huang, Zuozhi Wang, and Chen Li
 In SIGMOD 2024 | PDF | Slides
- (8/2023) Improving Iterative Analytics in GUI-Based Data-Processing Systems with Visualization, Version Control, and Result Reuse
 Sadeem Alsudais
 Ph.D. Thesis | PDF
- (7/2023) Using Texera to Characterize Climate Change Discussions on Twitter During Wildfires
 Shengquan Ni, Yicong Huang, Jessie W. Y. Ko, Alexander Taylor, Xiusi Chen, Avinash Kumar, Sadeem Alsudais, Zuozhi Wang, Xiaozhen Liu, Wei Wang, Suellen Hopfer, and Chen Li
 In Data Science Day at KDD 2023
- (7/2023) Raven: Accelerating Execution of Iterative Data Analytics by Reusing Results of Previous Equivalent Versions
 Sadeem Alsudais, Avinash Kumar, and Chen Li
 In HILDA Workshop at SIGMOD 2023 | PDF
- (6/2023) Texera: A System for Collaborative and Interactive Data Analytics Using Workflows
 Zuozhi Wang
 Ph.D. Thesis | PDF
- (12/2022) Towards Interactive, Adaptive and Result-aware Big Data Analytics
 Avinash Kumar
 Ph.D. Thesis | PDF
- (9/2022) Fries: Fast and Consistent Runtime Reconfiguration in Dataflow Systems with Transactional Guarantees
 Zuozhi Wang, Shengquan Ni, Avinash Kumar, and Chen Li
 In VLDB 2023 | PDF | Slides
- (7/2022) Drove: Tracking Execution Results of Workflows on Large Datasets
 Sadeem Alsudais
 In the Ph.D. Workshop at VLDB 2022 | PDF
- (6/2022) Demonstration of Accelerating Machine Learning Inference Queries with Correlative Proxy Models
 Zhihui Yang, Yicong Huang, Zuozhi Wang, Feng Gao, Yao Lu, Chen Li, and X. Sean Wang
 In VLDB 2022 | PDF
- (6/2022) Demonstration of Collaborative and Interactive Workflow-Based Data Analytics in Texera
 Xiaozhen Liu, Zuozhi Wang, Shengquan Ni, Sadeem Alsudais, Yicong Huang, Avinash Kumar, and Chen Li
 In VLDB 2022 | PDF | Demo Video
- (4/2022) Optimizing Machine Learning Inference Queries with Correlative Proxy Models
 Zhihui Yang, Zuozhi Wang, Yicong Huang, Yao Lu, Chen Li, and X. Sean Wang
 In VLDB 2022 | PDF
- (7/2020) Demonstration of Interactive Runtime Debugging of Distributed Dataflows in Texera
 Zuozhi Wang, Avinash Kumar, Shengquan Ni, and Chen Li
 In VLDB 2020 | PDF | Video | Slides
- (1/2020) Amber: A Debuggable Dataflow system based on the Actor Model
 Avinash Kumar, Zuozhi Wang, Shengquan Ni, and Chen Li
 In VLDB 2020 | PDF | Video | Slides
- (4/2017) A Demonstration of TextDB: Declarative and Scalable Text Analytics on Large Data Sets
 Zuozhi Wang, Flavio Bayer, Seungjin Lee, Kishore Narendran, Xuxi Pan, Qing Tang, Jimmy Wang, and Chen Li
 In ICDE 2017, Best Demo Award | PDF | Video
- (2/2025) DS4ALL: Teaching High-School Students Data Science and AI/ML Using the Texera Workflow Platform as a Service
 Jiadong Bai, Xiaozhen Liu, Anthony Cuturrufo, Alexander Kundu Taylor, Jeehyun Hwang, Mingyu Derek Ma, Xinyuan Lin, Yanqiao Zhu, Yicong Huang, Yunyan Ding, Wei Wang, and Chen Li
 To appear in Data Science Education K-12: Research to Practice Annual Conference 2025
- (7/2024) Brain Image Data Processing Using Collaborative Data Workflows on Texera
 Yunyan Ding, Yicong Huang, Pan Gao, Andy Thai, Atchuth Naveen Chilaparasetti, M. Gopi, Xiangmin Xu, and Chen Li
 In Frontiers in Neural Circuits | PDF
- (1/2024) Wording Matters: The Effect of Linguistic Characteristics and Political Ideology on Resharing of COVID-19 Vaccine Tweets
 Judith Borghouts, Yicong Huang, Suellen Hopfer, Chen Li, and Gloria Mark
 In TOCHI 2024 | PDF
- (1/2024) How the Experience of California Wildfires Shape Twitter Climate Change Framings
 Jessie W. Y. Ko, Shengquan Ni, Alexander Taylor, Xiusi Chen, Yicong Huang, Avinash Kumar, Sadeem Alsudais, Zuozhi Wang, Xiaozhen Liu, Wei Wang, Chen Li, and Suellen Hopfer
 In Climatic Change 2024 | PDF
- (11/2023) The Marketing and Perceptions of Non-Tobacco Blunt Wraps on Twitter
 Joshua U. Rhee, Yicong Huang, Aurash J. Soroosh, Sadeem Alsudais, Shengquan Ni, Avinash Kumar, Jacob Paredes, Chen Li, and David S. Timberlake
 In Substance Use & Misuse 2023 | PDF
- (3/2023) Understanding Underlying Moral Values and Language Use of COVID-19 Vaccine Attitudes on Twitter
 Judith Borghouts, Yicong Huang, Sydney Gibbs, Suellen Hopfer, Chen Li, and Gloria Mark
 In PNAS Nexus 2023 | PDF
- (10/2022) Public Opinions Toward COVID-19 Vaccine Mandates: A Machine Learning-Based Analysis of U.S. Tweets
 Yawen Guo, Jun Zhu, Yicong Huang, Lu He, Changyang He, Chen Li, and Kai Zheng
 In AMIA 2022 | PDF
- (9/2021) The Social Amplification and Attenuation of COVID-19 Risk Perception Shaping Mask-Wearing Behavior: A Longitudinal Twitter Analysis
 Suellen Hopfer, Emilia J. Fields, Yuwen Lu, Ganesh Ramakrishnan, Ted Grover, Qiushi Bai, Yicong Huang, Chen Li, and Gloria Mark
 In PLOS ONE 2021 | PDF
- (4/2021) Why Do People Oppose Mask Wearing? A Comprehensive Analysis of U.S. Tweets During the COVID-19 Pandemic
 Lu He, Changyang He, Tera Leigh Reynolds, Qiushi Bai, Yicong Huang, Chen Li, Kai Zheng, and Yunan Chen
 In JAMIA 2021 | PDF
- For users, visit Guide to Use Texera.
- For developers, visit Guide to Develop Texera.
Texera was formerly known as "TextDB" before August 28, 2017.
This project is supported by the National Science Foundation under awards IIS-1745673 and IIS-2107150, by AWS Research Credits, and by Google Cloud Platform Education Programs.
- This project is supported by an NIH NIDDK award.
- YourKit has given an open source license to use their profiler in this project.
Please cite Texera as:
@article{DBLP:journals/pvldb/WangHNKALLDL24,
  author       = {Zuozhi Wang and
                  Yicong Huang and
                  Shengquan Ni and
                  Avinash Kumar and
                  Sadeem Alsudais and
                  Xiaozhen Liu and
                  Xinyuan Lin and
                  Yunyan Ding and
                  Chen Li},
  title        = {Texera: {A} System for Collaborative and Interactive Data Analytics
                  Using Workflows},
  journal      = {Proc. {VLDB} Endow.},
  volume       = {17},
  number       = {11},
  pages        = {3580--3588},
  year         = {2024},
  url          = {https://www.vldb.org/pvldb/vol17/p3580-wang.pdf},
  timestamp    = {Thu, 19 Sep 2024 13:09:37 +0200},
  biburl       = {https://dblp.org/rec/journals/pvldb/WangHNKALLDL24.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}
