Kaiwen Liu

Career Profile

A highly skilled and detail-oriented statistician with a strong foundation in programming and data structures. Proficient in statistical analysis, data modeling, and machine learning techniques, and skilled in programming languages including R, Python, and SQL. Proven ability to develop algorithms and implement statistical models to solve complex problems. Strong communication and collaboration skills, with experience working in cross-functional teams. Seeking a challenging role that leverages my statistical and programming expertise to drive business results and innovation.

Experience

Teaching and Research Assistant
University of Arizona 2022.1-2022.12
As a teaching assistant, I provided essential support to students in courses such as Introduction to Function, Introduction to Statistics, and Introduction to Applied Regression and Generalized Linear Models. I graded homework assignments and exams, answered student questions during lectures, and held weekly office hours and exam review sessions. As a research assistant, I gained hands-on experience in setting up and managing a high-performance computing (HPC) environment for academic research. I used Singularity to create containers on HPC and developed shell scripts to improve data processing efficiency. Through my experience as both a teaching and research assistant, I have developed strong communication and collaboration skills, as well as a passion for solving complex problems through data analysis.

Technologies Used: R, Shell

Entry-level Software Engineer
Jinmu Pulse Health Technology Co., Ltd. 2020.9-2021.5
As an entry-level backend engineer, I collaborated with frontend engineers on the development of a mobile application that analyzed users' pulses to provide medical suggestions. I contributed to the design of the application's APIs, and implemented and tested them in Go. My experience with both relational SQL databases (such as MySQL) and NoSQL databases (such as Redis) allowed me to manage and manipulate data effectively. Through this experience, I developed strong habits of writing clean, efficient code and thorough documentation.

Technologies Used: Go, SQL, Git, Markdown

Projects

House Price Prediction - As part of my data science journey, I took on the Kaggle project House Price Prediction, where I applied various regression algorithms, including random forest, lasso, ridge regression, and XGBoost, as well as ensemble methods. Through careful feature engineering and model tuning, I was able to achieve accurate and reliable predictions on the dataset. This project sharpened my skills in data preprocessing, machine learning algorithms, and ensemble techniques.
Prediction on Readmissions of Diabetic Patients - As part of a project on predicting readmission of diabetic patients, I gained hands-on experience using Jupyter Notebook, NumPy, pandas, and scikit-learn for data preprocessing and analysis. I applied various classification algorithms, including random forest, XGBoost, and logistic regression, to solve the problem. This project allowed me to develop skills in data analysis and problem-solving, which I believe will be valuable in my future work.
Introduction to Kaggle and the above project - A YouTube video that introduces Kaggle and presents my project, Prediction on Readmissions of Diabetic Patients.
Dry Beans Classification - Developed an accurate machine learning model using the SVM algorithm to classify seven different types of dry beans (Phaseolus vulgaris) while gaining experience in data analysis and model evaluation. The project involved handling and cleaning large datasets, exploring data visually, implementing various machine learning algorithms for classification, and evaluating model performance using appropriate metrics. The developed model can help dry bean producers obtain uniform batches of beans, leading to improved market value and reduced labor costs.