A named entity marking apparatus, a named entity marking method, and a computer program product thereof are provided. The named entity marking apparatus comprises a processor and a storage unit, wherein the processor is electrically connected to the storage unit. The storage unit stores an electronic document and a named entity database. The processor marks the electronic document into a first marked document using a first set of the named entity database. The processor then decides a second set of the named entity database according to the first marked document, and re-marks the electronic document into a second marked document using the second set.
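The two-pass marking flow described above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract does not say how the second set is decided, so this sketch simply picks all database entries in the category that the first pass marked most often. The database contents and the names `DATABASE`, `FIRST_SET`, `mark`, `decide_second_set`, and `two_pass_mark` are hypothetical.

```python
from collections import Counter

# Hypothetical named entity database: category -> entity names.
DATABASE = {
    "city": {"Taipei", "Tainan", "Hsinchu"},
    "company": {"Acme", "Globex"},
}

# Hypothetical first set: a small general-purpose subset of the database.
FIRST_SET = {"Taipei": "city", "Acme": "company"}


def mark(text, entities):
    """Return (token, category) pairs; category is None for unmarked tokens."""
    return [(tok, entities.get(tok.strip(".,"))) for tok in text.split()]


def decide_second_set(first_marked):
    """Assumed rule: use every entry of the category marked most often in pass one."""
    counts = Counter(cat for _, cat in first_marked if cat is not None)
    if not counts:
        # Nothing was marked, so fall back to the first set.
        return dict(FIRST_SET)
    top = counts.most_common(1)[0][0]
    return {name: top for name in DATABASE[top]}


def two_pass_mark(text):
    """Mark with the first set, decide the second set, then re-mark the same text."""
    first_marked = mark(text, FIRST_SET)
    second_set = decide_second_set(first_marked)
    return mark(text, second_set)
```

In this sketch the first pass only recognizes "Taipei", which is enough to decide that the city entries form the second set, so the second pass also marks "Tainan" and "Hsinchu".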
The goal is to meet the needs of government, enterprises, and consumers for observing social networks by developing smart, automated observation solutions.
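In the online advertising ecosystem, demand-side platforms (DSPs) have emerged only in the last few years and have large growth potential. Small and medium-sized enterprises (SMEs) generally cannot afford agency fees, whereas a DSP offers self-service tools that help them target media placements precisely and optimize their media mix. This has gradually become a trend in the United States, where the SME base is very large and SME demand has directly driven the rise of DSPs. DSPs, however, must be built on open media data.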
When the Web Crawler captures big data from social communities, external factors such as blocking times can easily leave it idle. To solve this problem, the Web Crawler must reduce idleness in its operating schedule. This technology introduces a "decentralized dynamic deployment and scheduling mechanism" together with the "anchoring and adjustment theory" from financial management, so that the schedules and the number of Web Crawlers can be dynamically adjusted based on external factors. In addition, previous analysis approaches (such as semantic analysis and natural language processing) classify words by their attribute characteristics rather than as consumer recommendations or emotional words. This technology collects consumer recommendations and an emotional corpus from e-commerce operators' platforms to automatically build the Concept Space thesaurus, with the aim of providing social media customers with services for analyzing the reputation of particular commodities.
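The description does not give the scheduling formula, so the following is only a sketch of one way an anchoring-and-adjustment heuristic could be applied to crawl scheduling. It assumes each site has an anchor (the current expected blocking delay) that is moved part of the way toward the most recent observation, and that sites with the shortest expected delay are crawled first. The function names, the `adjustment_rate` parameter, and the site data are all hypothetical.

```python
def adjust_delay(anchor_delay, observed_block, adjustment_rate=0.3):
    """Anchoring and adjustment: move the anchor part of the way toward the observation."""
    return anchor_delay + adjustment_rate * (observed_block - anchor_delay)


def schedule(site_delays, observations, adjustment_rate=0.3):
    """Update each site's expected delay and order sites by shortest expected delay.

    site_delays:  site -> current anchor delay in seconds (assumed input)
    observations: site -> blocking time just observed in seconds (assumed input)
    Sites without a new observation keep their anchor unchanged.
    """
    updated = {
        site: adjust_delay(delay, observations.get(site, delay), adjustment_rate)
        for site, delay in site_delays.items()
    }
    order = sorted(updated, key=updated.get)
    return updated, order
```

For example, with anchors of 10 s for site "a" and 4 s for site "b", an observation that "a" is no longer blocked lowers its expected delay only partway, so "b" is still scheduled first until further observations confirm the change.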
A system for generating a ‘snapshot’ of a learning object is provided. An interface receives a target object and a user identification number. The target object corresponds to a category and comprises a plurality of sentences and multimedia data, wherein the sentences comprise at least one keyword. A learning object database comprises a plurality of learning objects and a user's historical learning record. Each of the learning objects corresponds to at least one category and comprises at least one keyword. The user's historical learning record comprises a track record of the learning objects used under the user identification number. A script preview unit selects at least one of the sentences of the target object according to the user's historical learning record corresponding to the user identification number. A multimedia preview unit selects one item of the multimedia data of the target object, wherein the selected item is highly related to the selected sentence. A ‘snapshot’ generator generates a ‘snapshot’ of the target object by combining the selected sentence and the selected multimedia data, and directs a display device to display the ‘snapshot’.
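A minimal sketch of the snapshot flow follows. The abstract does not define how sentences are matched to the learning record or how relatedness between a sentence and a multimedia item is measured, so this sketch assumes plain keyword overlap for both. The data layout (`sentences`, `media`, `tags`, `file`) and all function names are hypothetical.

```python
def select_sentence(sentences, history_keywords):
    """Script preview step: pick the sentence sharing the most keywords with the history."""
    def overlap(sentence):
        return len(set(sentence.lower().split()) & history_keywords)
    return max(sentences, key=overlap)


def select_media(media, sentence):
    """Multimedia preview step: pick the item whose tags overlap most with the sentence."""
    words = set(sentence.lower().split())
    return max(media, key=lambda item: len(words & set(item["tags"])))


def generate_snapshot(target, history_keywords):
    """Combine the selected sentence and the selected multimedia item into a snapshot."""
    sentence = select_sentence(target["sentences"], history_keywords)
    item = select_media(target["media"], sentence)
    return {"sentence": sentence, "media": item["file"]}
```

Here `history_keywords` stands in for the keywords of learning objects recorded under one user identification number; a real system would look these up in the learning object database.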
The invention discloses an error-detecting method for a Chinese article, handling a Chinese sentence that includes a first erroneous Chinese character string at a first location. The method includes subdividing the first erroneous Chinese character string into a plurality of first subgroups, wherein each of the first subgroups consists of two consecutive or non-consecutive Chinese characters out of the first erroneous Chinese character string. The method further includes providing a database containing a plurality of first correct Chinese character strings and a plurality of corresponding first correct indices, wherein each of the first correct indices consists of two consecutive or non-consecutive Chinese characters out of the corresponding first correct Chinese character string. The method further includes acquiring one of the first correct indices according to the first subgroup, and one of the first correct Chinese character strings according to the acquired first correct index. The method further includes generating a best candidate sentence according to the acquired first correct Chinese character string, and showing the Chinese sentence and the best candidate sentence on a display device.
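The subgroup-and-index lookup can be sketched as below. The two-character subgroups follow the abstract (ordered pairs of characters, consecutive or not), but the ranking rule is an assumption: the candidate of the same length that is reached through the most matching indices wins. The function names and the sample strings are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def pairs(string):
    """All ordered two-character subgroups, consecutive or non-consecutive."""
    return {a + b for a, b in combinations(string, 2)}


def build_index(correct_strings):
    """Map each two-character index to the correct strings it was taken from."""
    index = defaultdict(set)
    for s in correct_strings:
        for p in pairs(s):
            index[p].add(s)
    return index


def best_candidate(erroneous, index):
    """Assumed ranking: the same-length string matched by the most subgroups."""
    votes = Counter()
    for p in pairs(erroneous):
        for s in index.get(p, ()):
            if len(s) == len(erroneous):
                votes[s] += 1
    return votes.most_common(1)[0][0] if votes else None


def correct_sentence(sentence, location, erroneous, index):
    """Replace the erroneous string at `location` to form the best candidate sentence."""
    candidate = best_candidate(erroneous, index)
    if candidate is None:
        return sentence
    return sentence[:location] + candidate + sentence[location + len(erroneous):]
```

Because non-consecutive pairs are indexed too, a string with one wrong character in the middle still shares several indices with its correct form, which is what lets the lookup recover it.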
A method of generating and detecting confusing phones/syllables is disclosed. The method includes a generating stage and a detecting stage. The generating stage includes: (a) input a Mandarin utterance; (b) partition the Mandarin utterance into segmented phones/syllables and generate the most likely route in a recognition net via forced alignment with Viterbi decoding; (c) compare the segmented phones/syllables with a Mandarin acoustic model; (d) determine whether a confusing phone/syllable exists; (e) when a confusing phone/syllable exists, add it to the recognition net and repeat steps (b), (c), and (d); (f) when no confusing phone/syllable exists, stop and output all generated confusing phones/syllables to a confusing phone/syllable file. The detecting stage includes: (g) input a spoken sentence; (h) align the spoken sentence with the recognition net; (i) determine the most likely route of the spoken sentence; and (j) compare the most likely route of the spoken sentence with the target route of the spoken sentence to detect pronunciation errors and give high-level pronunciation suggestions.
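The detecting stage, steps (h) to (j), can be sketched in simplified form. This is not Viterbi forced alignment: as a loudly stated simplification, the sketch assumes the utterance is already segmented and that a hypothetical acoustic score is available for each alternative at each position, so the most likely route is just the best-scoring alternative per position. The function names, the `confusing` mapping, and the scores are all illustrative assumptions.

```python
def build_net(target_route, confusing):
    """Recognition net: each position allows the target syllable plus its confusing ones."""
    return [[syl] + sorted(confusing.get(syl, ())) for syl in target_route]


def most_likely_route(net, scores):
    """Stand-in for Viterbi alignment: take the best-scoring alternative per position.

    scores[i] maps a syllable to a hypothetical log-likelihood at position i.
    """
    return [
        max(alternatives, key=lambda syl: scores[i].get(syl, float("-inf")))
        for i, alternatives in enumerate(net)
    ]


def detect_errors(target_route, likely_route):
    """Step (j): report (position, expected, heard) wherever the two routes differ."""
    return [
        (i, expected, heard)
        for i, (expected, heard) in enumerate(zip(target_route, likely_route))
        if expected != heard
    ]
```

Each reported tuple names the expected syllable and the confusing syllable that scored higher, which is the information a pronunciation suggestion would be built from.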