import './Blog.css'
import {Button, Col, Container, Row} from "react-bootstrap";

export default function Blog(props) {
    return (
        <Container>
            <Row style={{display:"block"}} className={"entry"}>
                <h1> Knowledge Graphs in Machine Translation </h1>
                <p>
                    With nearly 100 languages that each have over 10 million speakers, translating between all pairs
                    of languages, each with its own idiosyncrasies, remains a large task. Over the last half-decade,
                    machine translation has made considerable progress, and the field has recently been dominated by
                    neural machine translation. Large improvements in natural language models, such as RNNs, LSTMs,
                    and more recently transformers, mean that models exploiting these architectures now top the
                    rankings for machine translation tasks. Renewed interest in knowledge graphs within deep learning
                    systems has led to a corresponding rise in exploring their use in machine translation. Because
                    natural language translation is tied heavily to entities and their corresponding relationships
                    across languages, knowledge graphs can improve machine translation when this contextual
                    understanding can be exploited. We answer this question: what is the current state of machine
                    translation, and what difficulties in machine translation do knowledge graphs solve?
                </p>
                <h3>Introduction</h3>
                <h5>The Problem</h5>
                <p>
                    As technology increases communication, the amount of multi-language communication continues to
                    increase. Companies like Google have offices around the world and produce content for users
                    across the world, and that content needs localization.
                </p>
                <figure style={{width: '50%'}}>
                    <img src={'google_locations.png'} alt={'Google office locations 2021'}/>
                    <figcaption>
                        <div className={'credit'}>
                            [3] “Browse a list of Google's Office Locations”
                        </div>
                        <div className={'description'}>
                            Fig.1 - Google office locations 2021
                        </div>
                    </figcaption>
                </figure>
                <p>
                    Additionally, this has been accelerated by a move towards virtual work and content that
                    facilitates an increase in multi-language interactions and exchanges that need translation. This
                    has created an unprecedented amount of content that needs translation, so much so that it would
                    be infeasible to translate even a small fraction of this content manually.
                </p>
                <figure style={{width: "50%"}}>
                    <img src={'wikipedia_growth.png'} alt={'Growth of Wikipedia from 2001'}/>
                    <figcaption>
                        <div className={'credit'}>
                            [14] “Modelling Wikipedia's growth”
                        </div>
                        <div className={'description'}>
                            Fig.2 - Growth of Wikipedia from 2001
                        </div>
                    </figcaption>
                </figure>
                <p>
                    Wikipedia hosts encyclopedic articles about topics in a multitude of languages. The goal of the
                    service is to make the knowledge hosted there as accessible as possible, which includes
                    translating articles into the language of the user. This type of moderated, hosted content is one
                    type of content that needs translation. Another is the content that users share between
                    themselves, such as YouTube videos. YouTube represents a large amount of content, with more than
                    500 hours of video being uploaded each minute [16]. YouTube serves almost every country in the
                    world and provides subtitle translations for over 100 languages.
                </p>
                <figure style={{width: '60%'}} >
                    <img src={'youtube_growth.png'} alt={'Growth of YouTube hours of content uploaded per minute'}/>
                    <figcaption>
                        <div className={'credit'}>
                            [16] “YouTube: hours of video uploaded every minute 2019”
                        </div>
                        <div className={'description'}>
                            Fig.3 - Growth of YouTube hours of content uploaded per minute
                        </div>
                    </figcaption>
                </figure>
                <p>
                    With this large amount of content that needs to be translated, manual translation methods cannot
                    keep up. This leads to the use of computers for the translation task. The machine translation
                    industry is on the rise and is projected to nearly triple over an eight-year period [9]. These
                    forms of machine translation include both translation assistance tools and fully automated tools
                    and APIs.
                </p>
                <figure  style={{width: '70%'}}>
                    <img src={'market_growth.png'} alt={'Projected growth of machine translation from 2019-2026'}/>
                    <figcaption>
                        <div className={'credit'}>
                            [9] “Insights on the Machine Translation Market 2020-2024”
                        </div>
                        <div className={'description'}>
                            Fig.4 - Projected growth of machine translation from 2019-2026
                        </div>
                    </figcaption>
                </figure>
                <p>
                    A more topical driver of machine translation is the current COVID-19 pandemic. The pandemic has
                    created the need for rapid communication across the international medical community [9]. French
                    medical research, for example, needs to be quickly translated into all relevant languages and
                    vice versa. Because this research relies on domain-specific knowledge, it represents a space
                    where knowledge graphs may thrive.
                </p>
                <h3>Background</h3>
                <h5>Language and Knowledge</h5>
                <p>
                    Language is closely tied to knowledge, a connection we can notice without examining either in
                    depth. A language exists as a way to exchange or store information. We can thus think of language
                    as a way of encoding knowledge that allows for future human extraction. For example in Fig.5, if
                    we say that "Dante was born in Florence", we encode knowledge about the relationship between
                    Dante and Florence [13]. This is exactly the type of relationship that knowledge graphs are used
                    to encode. This connection has been studied, and there has been recent success in extracting the
                    knowledge from language directly [12,13]. This has been done by using a language model, a model
                    that encodes the generation of a language, to predict the relationships instead of a knowledge
                    graph. Additionally, methods that directly attempt to extract knowledge graph structure from the
                    model itself have shown success.
                </p>
                <figure style={{width: '60%'}}>
                    <img src={'knowledge_language.png'} alt={'Relationship between knowledge and language'} />
                    <figcaption>
                        <div className={'credit'}>
                            [13] “Language Models as Knowledge Bases?”

                        </div>
                        <div className={'description'}>
                            Fig.5 - Relationship between knowledge and language
                        </div>
                    </figcaption>
                </figure>
                <p>
                    More relevant to us is simply that this relationship exists. It means that knowledge graphs and
                    languages are tied together, and that understanding knowledge via a knowledge graph gives
                    understanding of the language itself. This suggests that knowledge graphs could be useful in
                    translating from one language to another, since we can use a better understanding of the
                    underlying knowledge to improve translation.
                </p>
                <h5>History of Machine Translation</h5>
                <p>
                    Machine translation has gone through many phases with varied levels of success. The first of these
                    was a simple word-to-word translation method. Essentially, direct mappings from a word in one
                    language to a word in another were constructed to do simplistic translations. These methods
                    require mapping each word in one language manually, and fail to include any understanding of the
                    syntax of each language or of the issues that arise in mapping between languages. The simplest of
                    these issues is that languages just don't work this way. Each word in Japanese does not map
                    directly to an equivalent English word, and so this method fails.
                </p>

                <p>
                    The next method that appeared was rigorous linguistic categorization of languages, in an attempt
                    to develop a system of language understanding and subsequently create translators that operate
                    by mapping between the patterns found in each language. For example, a common English question
                    fits into the pattern "Is this a _____?". Similarly, we can create a pattern for a similar
                    Japanese question, "これは_____ですか". Then we can map English sentences of this form by just
                    filling in the appropriate word for the blank. This approach expanded as the sets of rules and
                    patterns grew, utilizing linguistic expertise to develop structural mappings for pairs of
                    languages.
                </p>
                <p>
                    Next came statistical attempts to translate languages. The idea is straightforward, and it was
                    the precursor to the current neural methods: we can exploit the fact that language structure is
                    not random to learn parameters that minimize the statistical risk of an incorrect translation.
                    These statistical methods all essentially exploit structural relationships in language (the lack
                    of randomness) to generate statistically favorable translations. With the advent of neural
                    networks and the ability to create models with millions (or billions) of parameters, the earlier
                    rule-based methods have essentially been abandoned in favor of statistical ones. Early statistical
                    methods were very simplistic, for example maximizing character-level likelihood.
                </p>
                <p>
                    Over the last several years, the methods used to perform these statistical predictions have grown
                    massively in number of parameters. This growth is facilitated by deeper models: neural networks
                    that train on massive datasets of text. Early on these models were simple fully connected neural
                    networks, but methods that exploit the structure of languages were developed to improve deep
                    learning performance. One of these was the recurrent neural network (RNN), which uses the
                    sequential nature of language to reduce the number of parameters needed by reapplying a single set
                    of weights, together with an accumulated context, to each processed word instead of having fully
                    connected weights for the entire sentence. This was improved into an RNN that includes a notion of
                    short-term and long-term memory, the long short-term memory (LSTM) network. These models act
                    similarly to RNNs but include extra logic to pass context as either short-term or long-term, and
                    they quickly topped the charts on machine translation tasks.
                </p>
                <p>
                    These LSTM models were augmented with a mechanism called attention, which allows the models to
                    learn weights that determine the importance of each word in the sentence. This mechanism saw
                    success, but then the revolutionary paper "Attention Is All You Need" showed that learning stacks
                    of these attention weights alone gives comparable or better performance than using attention to
                    assist RNNs [19]. Models that only learn stacks of attention weights are called Transformers, and
                    this method is currently the one that tops the leaderboards for translation tasks. However, since
                    RNNs are much more efficient at inference time, they are still widely used in real systems [4].
                </p>

                <h5>Current Problems in Machine Translation</h5>
                <p>
                    There are a few problems that still trouble modern machine translation models. These problems
                    often take one of two similar forms: rare words or lost entity relationships [1, 4]. Both are
                    simple to understand, but each gives models trouble. For rare words, we are left with even fewer
                    translation pairings, and so the model has fewer comparative samples from which to learn the
                    translation. This also means the model has little motivation to learn these pairings, as
                    optimizing these rare cases does much less to minimize the loss than mastering the very common
                    cases.
                </p>
                <figure style={{width: '70%'}}>
                    <img src={'rare_words_french.png'} alt={"Example of French translation with rare words"}/>
                    <figcaption>
                        <div className={'credit'}>
                            [1] “Addressing the Rare Word Problem in Neural Machine Translation”
                        </div>
                        <div className={'description'}>
                            Fig.6 - Example of French translation with some rare words that don't get learned by the
                            model
                        </div>
                    </figcaption>
                </figure>
                <p>
                    Good examples of words with this issue are those that are common only in specific domains, such
                    as anti-reductionism or penumbra. The similar issue, missing entity relationships, comes about
                    when the machine translator loses an entity relationship in a language. This leads to knowledge
                    loss in the model, which can then reduce the quality of the translations.
                </p>
                <p>
                    For example, the sentence
                </p>
                <p style={{textAlign: 'center'}}><b> Joe Biden lives in the White House </b></p>

                <p>
                    contains knowledge about the relationship between Joe Biden and the White House that a reader can
                    interpret even if the proper noun White House is not denoted as such. The knowledge integral to
                    correctly translating this is that Joe Biden is the president and the president resides in the
                    White House; therefore the writer most likely means that Joe Biden lives in the White House. With
                    this knowledge when translating to Japanese, we would want
                </p>
                <p style={{textAlign: 'center'}}>
                    <b>ジョーバイデンは<span style={{color:'#00FF00C0'}}>ホワイトハウス</span>に住んでいます</b>
                </p>
                <p>
                    instead of the incorrect
                </p>
                <p style={{textAlign: 'center'}}>
                    <b>ジョーバイデンは<span style={{color:'#FF0000C0'}}>白い家</span>に住んでいます</b>
                </p>

                <p>
                    which instead translates to a statement that Joe Biden lives in a white house. This issue is
                    similar to the rare word issue in that it is a consequence of rare entity relationships in the
                    data. Both cases center on the machine translator having trouble with knowledge. In a real sense,
                    the machine translator loses knowledge encoded in language; however, this knowledge is not
                    immediately present in the individual sentence. We use external knowledge about the president to
                    successfully decode the information in the sentence. This suggests that knowledge graphs may help
                    alleviate this struggle for machine translators by acting as the external source of knowledge we
                    implicitly reference when interacting with language. While this connection seems clear, it is not
                    the only strategy used to alleviate these issues. Indeed, methods that don't rely on an external
                    knowledge graph exist, and they were what recently powered Google Translate [4].
                </p>

                <h3>State of the Art</h3>
                <p>
                    The following is essentially a condensed version of [4], which is a 2020 report summarizing the
                    approaches used by Google Translate up to 2020. Google releases these reports when they believe
                    that keeping a method secret is no longer beneficial, for example when they transfer to a new
                    method. Thus, there is a high chance that these methods have been improved in Google's current
                    translation model; however, their one-year-old methods still boast impressive results.
                </p>
                <h5>Data Improvements</h5>
                <p>
                    The first solution that Google Translate uses to combat these issues is one that helps almost any
                    deep model: data improvement. Google Translate uses a massive compute cluster to scrape text data
                    from websites that have multiple language versions of the same text. For example, news sites
                    often contain the same article in multiple languages and so make a good target for this. More
                    data generally improves a deep model's performance, and the team found that the model better
                    learns rare words if the number of all word occurrences, including the rare words, is huge. They
                    don't only improve performance by increasing the quantity of data; the team also uses methods to
                    improve the quality of the samples. For example, they optimize models with strategies that favor
                    quality data and use denoising techniques to determine whether data is likely of poor quality.
                </p>
                <h5>Training Improvements</h5>
                <p>
                    Two other methods used to help with rare words and rare entity relationships are training
                    augmentations that help in cases where this large amount of data cannot be obtained. The first of
                    these is the M4 model, which boosts language pairs that have few paired examples by
                    simultaneously learning how to translate from multiple languages into the original target
                    language. This is a form of transfer learning, an approach that has seen success across machine
                    learning.
                </p>
                <p>
                    The next method, used in conjunction with M4 modeling, is back-translation. This allows a reverse
                    translator, one that goes from the original translator's target language back to its source
                    language, to be trained on samples generated by the original model. This generates artificial
                    training data, and so allows for more samples than exist in any actual dataset.
                </p>
                <p>
                    Together, an example pipeline for an English to Yiddish translator would first simultaneously
                    train several models that translate from many languages into English, thus exploiting M4 modeling
                    to boost the performance of the low-data English-Yiddish pairing with the high-data pairings for
                    languages like English-Spanish and English-German. Now that you have a boosted Yiddish to English
                    model, you would use back-translation to train an English to Yiddish translator, using data
                    generated by the Yiddish to English translator.
                </p>
                <h5>Other Improvements</h5>
                <p>
                    One important part of the Google Translate training process is that experts review some of the
                    produced samples, confirming or correcting the model. The model then learns from this feedback and
                    retrains based upon the reinforcement it receives. This allows the model to continuously improve
                    its predictions and dataset.
                </p>
                <h3>Use of Knowledge Graphs</h3>
                <p>
                    Here we will go over three ways that knowledge graphs are used to boost translation models. The
                    first of these extracts the entity links for a given word in the sentence as a set of continuous
                    embeddings for the model. Since a graph encodes geometric information about entity relationships,
                    it can be converted into the continuous geometric representation given by embeddings using a
                    projection mapping. Essentially, this attaches the knowledge contained in the knowledge graph to
                    each word in the sentence, which in turn injects the external knowledge from the knowledge graph
                    into the input for the model.
                </p>
                <figure style={{width: '70%'}}>
                    <img src={'knowledge_ex1.png'}
                         alt={'Architecture summary for "Knowledge Graphs Effectiveness in Neural Machine Translation Improvement"'}/>
                    <figcaption>
                        <div className={'credit'}>
                            [10] “Knowledge Graphs Effectiveness in Neural Machine Translation Improvement”
                        </div>
                        <div className={'description'}>
                            Fig.7 - Architecture Summary
                        </div>
                    </figcaption>
                </figure>
                <p>
                    This method is used in both of the architectures in figures 7 and 8. It allows the external
                    knowledge in the knowledge graph to be used directly by the model. Another method adds a knowledge
                    loss to the training process to measure how much knowledge is lost by the model. This method uses
                    two knowledge graphs, one in the source language and one in the target language, to determine how
                    many entity relationships were lost. The easiest way to understand this is to walk through the
                    steps:
                </p>

                <p>
                    <ul>
                        <li>Generate a list of entity relationships for the source sentence from the source-language knowledge graph</li>
                        <li>Translate the sentence using the translator</li>
                        <li>Generate a list of entity relationships for the translated sentence from the target-language knowledge graph</li>
                        <li>Check that the entity relationships match those of the source sentence and penalize any that were lost</li>
                    </ul>
                </p>
                <p>
                    This method checks that entity relationships are maintained in the translation process and is used to
                    solve the knowledge loss problem mentioned previously. An example of this can be seen in the
                    architecture in figure 8.
                </p>
                <figure style={{width: '70%'}}>
                    <img src={'knowledge_ex2.png'} alt={'Architecture summary for "Utilizing Knowledge Graphs for Neural Machine Translation Augmentation"'}/>
                    <figcaption>
                        <div className={'credit'}>
                            [18] “Utilizing Knowledge Graphs for Neural Machine Translation Augmentation”
                        </div>
                        <div className={'description'}>
                            Fig.8 - Architecture Summary
                        </div>
                    </figcaption>
                </figure>
                <p>
                    The final method is just a variant of these in which the knowledge graph is built specifically to
                    contain domain-specific knowledge. This means building a knowledge graph with relationships for a
                    specific field, such as medicine. A generalized model can then use specialized knowledge as an
                    augmentation for a particular dataset, instead of requiring a new translator trained on that
                    dataset. This makes it possible to use pretrained models on specialized translation tasks.
                </p>

                <h5>BERT as a source of knowledge</h5>
                <p>
                    Another way to use knowledge in machine translation models is to attempt to extract the knowledge
                    contained in BERT or other language models. By doing this you remove the need to build or use a
                    structured knowledge base for your particular task, and can instead use a pretrained model or
                    simply learn the knowledge as a part of your training scheme. Many different approaches have been
                    tried to solve this problem.
                </p>
                <p>
                    The most common approach to extracting knowledge from a language model like BERT is to create the
                    knowledge embeddings directly from the model. One way this has been done is by directly taking the
                    embeddings learned by BERT as a baseline set of embeddings for your model. The notion is that
                    these embeddings geometrically encode the entity relationships learned by BERT, and so augmenting
                    your model with these embeddings should inject this knowledge into your model. A similar method
                    for building embeddings from BERT is to take the output of the last internal layer of BERT as the
                    embeddings. The reasoning behind this approach is that the last internal output is the
                    representation of the original sentence that BERT found most useful for its task. Thus the thought
                    is that this representation should encode the knowledge that BERT uses for its modeling task. Both
                    of these approaches showed less than optimal results on real translation tasks.
                </p>
                <p>
                    Another approach that has been used is applying BERT as a tool to detect errors in translation.
                    This approach relies on the use of BERT as a language model to power a sentence evaluator. One
                    consequence is that this strategy requires a model like BERT for the target language. This
                    approach saw some success, but is only tangentially related to the use of BERT's knowledge.
                </p>
                <p>
                    Finally, similar to the discussion of language models being used to fill in entity relationships
                    instead of knowledge bases, some approaches have used BERT as a knowledge graph directly. These
                    approaches are challenging, as it is difficult to extract knowledge from BERT in a manner
                    immediately useful for a translation task. This approach is tied more to general methods that
                    extract knowledge graphs from language models than it is to translation tasks specifically. These
                    approaches are still in their infancy, but could result in ways to build learned knowledge graphs
                    in the future.
                </p>
                <h5>When They Work</h5>
                <p>
                    Knowledge graphs have seen success, but are not used in most state-of-the-art approaches. There
                    are particular cases in which knowledge graphs excel that we will outline here. The first is cases
                    where data is limited. Many of the best approaches are developed by massive companies with large
                    data and compute capabilities. Most engineers don't have access to Google levels of data to train
                    their models. In these cases, injecting knowledge via knowledge graphs has had success. This is
                    because, with minimal data and parameters, the model cannot learn to encode all the knowledge it
                    needs alone. Thus, augmenting the model with external knowledge can help alleviate these problems.
                </p>
                <p>
                    Another case where knowledge graphs excel is pairings of languages with different entity
                    relationship structures. This is why many knowledge graph papers tackle and excel on translation
                    tasks from English to Japanese, and not as much on English to German. Translating from English to
                    German is a much simpler task, and the closeness of the two languages means the issue of rare
                    pairings diminishes. In English-Japanese translation, the translator must also learn to handle
                    the fact that English and Japanese often do not have compatible ways to express the same thing. In
                    these cases, injecting expert knowledge can offset relationships the model fails to learn on its
                    own.
                </p>
                <p>
                    One case where knowledge graphs truly excel is in translating domain-specific text, for example
                    medical documents or legal paperwork. In these cases we can inject expert knowledge for the
                    particular domain into an existing translator to improve its performance in that field. These
                    cases also best represent the notion of knowledge, as they rely on expert knowledge to translate.
                </p>
                <h3>The Future</h3>
                {/*<h5>Considering when to use machine translation</h5>*/}
                {/*<p>*/}
                {/*    Machine translation is being used across many applications due to the need for massive*/}
                {/*    translation. There are many times where machine translation makes sense, for example large*/}
                {/*    amounts of user data that would be otherwise inaccessible. In these cases, translation failure*/}
                {/*    may result in some confusion, but there is an expectation that YouTube subtitles may fail, so*/}
                {/*    the failure cost is low. Anytime there is a large cost for a mistake, machine translation alone is*/}
                {/*    probably not the way to go.*/}
                {/*</p>*/}
                <h5>State of the Translation Supply Chain</h5>
                <p>
                    Machine translation is being used across many applications due to the need for massive amounts of
                    translation. A report on the state of the linguist supply chain surveyed translators about their
                    thoughts on machine translation.
                </p>
                <figure style={{width: '70%'}}>
                    <img src={"translation_survey.png"} alt={"Translator survey results about thoughts on machine translation"}/>
                    <figcaption>
                        <div className={'credit'}>
                            [17] “The State of the Linguist Supply Chain”
                        </div>
                        <div className={'description'}>
                            Fig.9 - Translator survey results about thoughts on machine translation
                        </div>
                    </figcaption>
                </figure>
                <p>
                    It seems that in its current state, machine translation is better appreciated as a tool to assist in
                    translation. Quality does not seem to have reached the point where machine translation can simply
                    be relied on for all tasks. This may suggest a move toward systems that assist translators instead
                    of completely automating translation.
                </p>
                <h5>Data situation</h5>
                <p>
                    Another future direction for machine translation is building better datasets for the task.
                    Currently there exist many datasets for popular language pairs like English-Spanish and
                    English-German, but public datasets of sufficient size for other language pairs are limited. This
                    issue will hopefully ease as more data becomes available over time.
                </p>
                <h5>Future Use of Knowledge Graphs</h5>
                <p>
                    Some issues stand in the way of using structured knowledge graphs in machine translation at a
                    large scale. One of the largest is that many of the methods that exploit structured knowledge in
                    translation tasks require a knowledge graph for both languages being paired, and the others
                    require a knowledge graph for the source language. This is an issue because the low-data
                    languages where knowledge graphs would help most generally lack knowledge graphs. So, one thing
                    that is needed is the creation of large knowledge graphs for more languages.
                </p>
                <p>
                    Creating these knowledge graphs is expensive and is often done by hand. Extracting the knowledge
                    held in language models such as BERT would instead allow large or domain-specific knowledge bases
                    to be created very quickly. This is why determining a way to extract useful knowledge from BERT
                    seems like the current goal. If this is achieved, then augmenting a model with knowledge becomes
                    as simple as either adding pretrained knowledge to your model or adding this learning step to
                    your current pipeline.
                </p>
                <h3>References</h3>

                <p className={'bib-entry'}>
                    [1] T. Luong, I. Sutskever, Q. Le, O. Vinyals, and W. Zaremba, “Addressing the Rare Word Problem in Neural Machine Translation,” in Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Beijing, China, 2015, pp. 11–19, doi: 10.3115/v1/P15-1002.
                </p>
                <p className={'bib-entry'}>
                    [2] D. Liang et al., “BERT Enhanced Neural Machine Translation and Sequence Tagging Model for Chinese Grammatical Error Diagnosis,” in Proceedings of the 6th Workshop on Natural Language Processing Techniques for Educational Applications, Suzhou, China, Dec. 2020, pp. 57–66, [Online]. Available: https://www.aclweb.org/anthology/2020.nlptea-1.8.
                </p>
                <p className={'bib-entry'}>
                    [3] “Browse a list of Google’s Office Locations - Google.” //www.google.com/locations/ (accessed Mar. 06, 2021).
                </p>
                <p className={'bib-entry'}>
                    [4] I. Caswell and B. Liang, “Recent Advances in Google Translate,” Google AI Blog. http://ai.googleblog.com/2020/06/recent-advances-in-google-translate.html (accessed Mar. 06, 2021).
                </p>
                <p className={'bib-entry'}>
                    [5] “Catastrophe in Marketing Translation,” Brightlines Translation, Feb. 08, 2019. https://www.brightlines.co.uk/translation-services/marketing/catastrophe-in-marketing-translation-and-how-to-avoid-it/ (accessed Mar. 06, 2021).
                </p>
                <p className={'bib-entry'}>
                    [6] Y. Lu, J. Zhang, and C. Zong, “Exploiting Knowledge Graph in Neural Machine Translation,” in Machine Translation, Singapore, 2019, pp. 27–38, doi: 10.1007/978-981-13-3083-4_3.
                </p>
                <p className={'bib-entry'}>
                    [7] “Google Translate.” https://translate.google.com/ (accessed Mar. 06, 2021).
                </p>
                <p className={'bib-entry'}>
                    [8] J. Zhu et al., “Incorporating BERT into Neural Machine Translation,” arXiv:2002.06823 [cs], Feb. 2020, Accessed: Mar. 06, 2021. [Online]. Available: http://arxiv.org/abs/2002.06823.
                </p>
                <p className={'bib-entry'}>
                    [9] “Insights on the Machine Translation Market 2020-2024: COVID-19 Industry Analysis, Drivers, Restraints, Opportunities, and Threats - Technavio,” AP NEWS, Nov. 10, 2020. https://apnews.com/press-release/business-wire/technology-north-america-58a4bf89f5cb4647839caf19ced9eebf (accessed Mar. 06, 2021).
                </p>
                <p className={'bib-entry'}>
                    [10] B. Ahmadnia, B. J. Dorr, and P. Kordjamshidi, “Knowledge Graphs Effectiveness in Neural Machine Translation Improvement,” csci, vol. 21, no. 3, Sep. 2020, doi: 10.7494/csci.2020.21.3.3701.
                </p>
                <p className={'bib-entry'}>
                    [11] Y. Zhao, J. Zhang, Y. Zhou, and C. Zong, “Knowledge Graphs Enhanced Neural Machine Translation,” in Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Yokohama, Japan, Jul. 2020, pp. 4039–4045, doi: 10.24963/ijcai.2020/559.
                </p>
                <p className={'bib-entry'}>
                    [12] C. Wang, X. Liu, and D. Song, “Language Models are Open Knowledge Graphs,” arXiv:2010.11967 [cs], Oct. 2020, Accessed: Mar. 06, 2021. [Online]. Available: http://arxiv.org/abs/2010.11967.
                </p>
                <p className={'bib-entry'}>
                    [13] F. Petroni et al., “Language Models as Knowledge Bases?,” in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 2019, pp. 2463–2473, doi: 10.18653/v1/D19-1250.
                </p>
                <p className={'bib-entry'}>
                    [14] “Wikipedia:Modelling Wikipedia’s growth,” Wikipedia. Dec. 13, 2020, Accessed: Mar. 06, 2021. [Online]. Available: https://en.wikipedia.org/w/index.php?title=Wikipedia:Modelling_Wikipedia%27s_growth&oldid=993918468.
                </p>
                <p className={'bib-entry'}>
                    [15] S. Clinchant, K. W. Jung, and V. Nikoulina, “On the use of BERT for Neural Machine Translation,” in Proceedings of the 3rd Workshop on Neural Generation and Translation, Hong Kong, 2019, pp. 108–117, doi: 10.18653/v1/D19-5611.
                </p>
                <p className={'bib-entry'}>
                    [16] “YouTube: hours of video uploaded every minute 2019,” Statista. https://www.statista.com/statistics/259477/hours-of-video-uploaded-to-youtube-every-minute/ (accessed Mar. 06, 2021).
                </p>
                <p className={'bib-entry'}>
                    [17] H. Pielmeier and P. O’Mara, “The State of the Linguist Supply Chain.” CSA Research, Jan. 2020, [Online]. Available: https://cdn2.hubspot.net/hubfs/4041721/Newsletter/The%20State%20of%20the%20Linguist%20Supply%20Chain%202020.pdf.
                </p>
                <p className={'bib-entry'}>
                    [18] D. Moussallem, A.-C. Ngonga Ngomo, P. Buitelaar, and M. Arcan, “Utilizing Knowledge Graphs for Neural Machine Translation Augmentation,” in Proceedings of the 10th International Conference on Knowledge Capture, Marina Del Rey CA USA, Sep. 2019, pp. 139–146, doi: 10.1145/3360901.3364423.
                </p>
                <p className={'bib-entry'}>
                    [19] A. Vaswani et al., “Attention Is All You Need,” arXiv:1706.03762 [cs], Dec. 2017, Accessed: Mar. 06, 2021. [Online]. Available: http://arxiv.org/abs/1706.03762.
                </p>
            </Row>
        </Container>
    );
}