BloomWise: Enhancing problem-solving capabilities of LLMs using Bloom’s-Taxonomy-inspired prompts

Ζουμπουλίδη, Μαρία Ελένη; Zoumpoulidi, Maria Eleni

dc.contributor.author	Ζουμπουλίδη, Μαρία Ελένη	el
dc.contributor.author	Zoumpoulidi, Maria Eleni	en
dc.date.accessioned	2025-05-23T10:27:41Z
dc.date.available	2025-05-23T10:27:41Z
dc.identifier.uri	https://dspace.lib.ntua.gr/xmlui/handle/123456789/61930
dc.identifier.uri	http://dx.doi.org/10.26240/heal.ntua.29626
dc.rights	Αναφορά Δημιουργού-Μη Εμπορική Χρήση-Όχι Παράγωγα Έργα 3.0 Ελλάδα	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/gr/	*
dc.subject	Μεγάλα γλωσσικά μοντέλα	el
dc.subject	Προτροπές	el
dc.subject	Ταξονομία Bloom	el
dc.subject	Μαθηματικά προβλήματα	el
dc.subject	Large Language Models	en
dc.subject	Bloom's Taxonomy	en
dc.subject	Prompts	en
dc.subject	Math problems	en
dc.title	BloomWise: Enhancing problem-solving capabilities of LLMs using Bloom’s-Taxonomy-inspired prompts	en
heal.type	bachelorThesis
heal.classification	Επεξεργασία Φυσικής Γλώσσας	el
heal.language	en
heal.access	free
heal.recordProvider	ntua	el
heal.publicationDate	2024-12-18
heal.abstract	Η περιορισμένη ικανότητα των Μεγάλων Γλωσσικών Μοντέλων (ΜΓΜ-LLMs) στα μαθηματικά, δεξιότητα κρίσιμη για την επίλυση σύνθετων προβλημάτων, βρίσκεται στο επίκεντρο του ερευνητικού ενδιαφέροντος. Πολλές προσεγγίσεις επιστρατεύουν το in-context learning. Οι κυριότερες εξ αυτών αφορούν στην ενθάρρυνση των ΜΓΜ μέσω προτροπών να προσεγγίσουν το πρόβλημα σταδιακά αναπτύσσοντας τη σκέψη τους σε κειμενική μορφή (Chain of Thought-CoT) ή να επιλύσουν το πρόβλημα με χρήση κώδικα (Program of Thought-PoT). Ωστόσο, τη μεγαλύτερη ακρίβεια επιτυγχάνουν μέθοδοι οι οποίες βασίζονται στην ενσωμάτωση πολλαπλών μεθόδων και επιλογή της κατά περίπτωση κατάλληλης, όπως, παραδείγματος χάριν, η X of Thought (XoT). Στην παρούσα διπλωματική εργασία, προτείνουμε το BloomWise, μια νέα, εμπνευσμένη από την ταξονομία Bloom, τεχνική prompting η οποία στοχεύει στη βελτίωση των επιδόσεων των ΜΓΜ στην επίλυση μαθηματικών προβλημάτων ενθαρρύνοντάς τα να προσεγγίσουν το πρόβλημα επιστρατεύοντας αρχικά απλές και προοδευτικά, αν είναι απαραίτητο, ανώτερες πνευματικές δεξιότητες. Μέσω εκτεταμένων πειραμάτων σε διάφορα σύνολα δεδομένων και μοντέλα, καταδεικνύουμε την αποτελεσματικότητα της μεθόδου. Επίσης, παρουσιάζουμε παραλλαγές της προσέγγισης, αναδεικνύουμε τη χρησιμότητα κάθε τμήματος της μεθόδου μέσω κατάλληλων αφαιρέσεων και πραγματοποιούμε εμβριθή ανάλυση των αποτελεσμάτων εστιάζοντας στην αποτελεσματικότητα κάθε πνευματικής δεξιότητας της ταξονομίας τόσο ανά σύνολο δεδομένων όσο και ανά μοντέλο. Συνάγουμε συμπεράσματα τόσο για τη μέθοδό μας όσο και για τις ικανότητες των ΜΓΜ. Όσον αφορά στη μέθοδό μας, επιτυγχάνει παρεμφερή, και ορισμένες φορές μεγαλύτερη, ακρίβεια από τις προς σύγκριση μεθόδους. Συγκεκριμένα, η επίδοση του BloomWise είναι παρόμοια με αυτή της XoT και καλύτερη από αυτή των CoT και PoT, ενώ η σημαντικά μεγαλύτερη ακρίβεια του Oracle καταδεικνύει τη δυναμική της μεθόδου. Όσον αφορά στα ΜΓΜ, η μέθοδος προσφέρει πολύτιμες πληροφορίες σχετικά με τις γνωστικές δεξιότητες τις οποίες επιδεικνύει κάθε ΜΓΜ, καθώς και τις δεξιότητες οι οποίες απαιτούνται για την επίλυση διαφορετικών τύπων μαθηματικών προβλημάτων, ενισχύοντας την ερμηνευσιμότητα. Μερικές εκ των κυριότερων παρατηρήσεων είναι οι εξής: σε όλα τα μοντέλα η μεγαλύτερη ακρίβεια επετεύχθη στα στάδια «Ανάλυση» και «Κατανόηση», και για τα δύσκολα προβλήματα επιτυγχάνεται καλύτερη επίδοση στα ανώτερα στάδια της ταξονομίας, ενώ το αντίστροφο δεν επιβεβαιώνεται.	el
heal.abstract	The limited ability of Large Language Models (LLMs) in mathematics, a skill critical for solving complex problems, has garnered significant interest from the research community. Many approaches employ in-context learning to improve LLMs’ performance on such tasks. The most prominent of these use prompts to encourage LLMs either to approach a problem step by step, developing their reasoning in textual form (Chain of Thought, CoT), or to solve it using code (Program of Thought, PoT). However, the highest accuracy is achieved by methods that integrate multiple approaches and select the appropriate one for each case, such as X of Thought (XoT). In this thesis, we propose BloomWise, a new prompting technique inspired by Bloom’s Taxonomy and aimed at improving LLMs’ performance in solving mathematical problems. BloomWise encourages models to approach a problem with simple cognitive skills first and, if necessary, with progressively higher ones. Through extensive experiments on various datasets and models, we demonstrate the effectiveness of the method. Additionally, we present variations of the approach, show the usefulness of each component through ablation studies, and conduct an in-depth analysis of the results, focusing on the effectiveness of each cognitive skill in the taxonomy, both by dataset and by model. We draw conclusions about both our method and the capabilities of LLMs. Our method achieves accuracy comparable to, and sometimes higher than, that of the methods it was compared against. Specifically, the performance of BloomWise is similar to that of XoT and better than that of CoT and PoT, while the significantly higher accuracy in the Oracle setting highlights the method’s potential. As for the LLMs, the method offers valuable insights into the cognitive skills each LLM demonstrates, as well as the skills required for solving various types of mathematical problems, thus enhancing interpretability. Some of the key observations are as follows: across all models, the highest accuracy was achieved at the "Analyzing" and "Understanding" stages, and, while difficult problems are solved more accurately at the higher taxonomy stages, the reverse does not hold.	en
heal.advisorName	Ποταμιάνος, Αλέξανδρος	el
heal.committeeMemberName	Ροντογιάννης, Αθανάσιος	el
heal.committeeMemberName	Βουλόδημος, Αθανάσιος	el
heal.academicPublisher	Εθνικό Μετσόβιο Πολυτεχνείο. Σχολή Ηλεκτρολόγων Μηχανικών και Μηχανικών Υπολογιστών. Τομέας Σημάτων, Ελέγχου και Ρομποτικής	el
heal.academicPublisherID	ntua
heal.numberOfPages	111 σ.	el
heal.fullTextAvailability	false