
UM E-Theses Collection (澳門大學電子學位論文庫)

Title

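Application of the partial least squares method combined with frailty factors in breast cancer survival analysis (結合脆弱因子的偏最小二乘方法在乳腺癌生存分析中的應用)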

English Abstract

Objective: To evaluate how well partial least squares with frailty factors (PLSF), combined with the Cox model (PLSF-Cox), predicts prognosis in patients with breast cancer. Methods: Two real breast cancer datasets and two simulated datasets were analyzed with the Cox model, partial least squares combined with the Cox model (PLS-Cox), and the PLSF-Cox model. The maximum amount of data the PLSF model could handle and the use of frailty factors were also examined. For each model, the number of components obtained from dimension reduction, the computation time, and the discrimination after regression were recorded and evaluated. Results: PLSF performed well in dimension reduction. PLS-Cox and PLSF-Cox outperformed the Cox model, and PLSF-Cox was the most stable of the three. In the two real datasets, the largest ratios of gene variables to sample size that could be computed were 55 and 83, respectively, but at these ratios the components showed large internal differences; ratios of 0.5 or 1 were more appropriate. The first simulated dataset gave dimension reduction and discrimination results similar to those of the real datasets. In the second simulated dataset, the PLSF-Cox model could not run when censoring status was used as the frailty factor. Age group, grade, and tumor size group were used as replacement frailty factors in PLSF-Cox, and the non-inferiority of the replacement supports its feasibility. Conclusion: PLSF-Cox performs well overall, although the amount of data it can handle is limited. Gene-to-sample ratios of 0.5 and 1 were found adequate for PLSF-Cox analysis. The frailty factor in the PLSF-Cox model can be replaced by other clinical indexes.
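The abstract compares models by their "discrimination" after regression but does not say which statistic was used. One common choice for censored survival data is Harrell's concordance index, so the following is a minimal sketch under that assumption. The function name `concordance_index` and the toy values are illustrative and are not taken from the thesis.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C for right-censored data (illustrative sketch).

    times       -- observed follow-up time per patient
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- model output, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient `a` has the shorter observed time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # Simplification: skip tied times, and skip pairs whose earlier
        # time is censored, because their true ordering is unknown.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier failure had the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up example: five patients, two of them censored.
toy_times = [5, 8, 12, 20, 25]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.7, 0.4, 0.5, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risk)
```

A value near 1 means the model ranks patients in the order they fail, and 0.5 is what a constant score gives. Under this assumption, the comparison among Cox, PLS-Cox, and PLSF-Cox would amount to computing such a statistic on each model's risk scores.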

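Chinese Abstract (English translation)

Objective: Using real and simulated data, to evaluate in depth Cox regression models built on PLS regression dimension reduction, in particular the PLSF-Cox model that adds frailty factors, with respect to how well they predict prognostic risk in breast cancer patients and what results may arise; and to carry out a preliminary exploratory analysis of the maximum amount of data the model can handle and of the use of frailty factors. Methods: Working in R and RStudio, R programs were written with functions from the relevant packages to apply the PLSF-Cox model to the real and simulated data and to analyze the results. The results for each model, such as the number of components obtained after dimension reduction and the computation time, were recorded and considered together to reach the final conclusions. Results: In the dimension reduction stage, the PLSF model reduced dimensions well and greatly lowered the number of components needed for modeling. In the regression stage, the PLS-Cox and PLSF-Cox models both discriminated better than the Cox model, and the PLSF-Cox model was the most stable. For the two real datasets, the largest numbers of gene variables that could be computed in this study's environment were 55 and 83 times the number of patients, respectively, but the resulting components showed large internal differences; the data capacity was more suitable when the ratio of gene variables to patients was 0.5 or 1. The dimension reduction and regression results for simulated dataset 1 were similar to those for the two real datasets. In simulated dataset 2, the PLSF-Cox model built with censoring status as the frailty factor could not run, whereas the PLSF model built with age group, grade, and tumor size group as replacement frailty factors performed no worse than the model using censoring status as the frailty factor. This reflects both the limitations of the original model and the feasibility of replacing the factor. Conclusion: In application, the PLSF-Cox model studied here is limited in the amount of data it can process, and it gives better results when the ratio of gene variables to patients is small. The frailty factor in the model can be replaced in different situations, and the replacement is non-inferior.

Keywords: partial least squares, frailty factor, breast cancer, simulated data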

Issue date

2015

Author

朱亞楠

Faculty

Institute of Chinese Medical Sciences

Degree

M.Sc.

Subject

Least squares

最小二乘法

Breast -- Cancer

乳房 -- 癌症

Supervisor

梁少偉

Files In This Item

Full-text (Intranet only)

Location
1/F Zone C
Library URL
991000692159706306