jonathan.poalses / Major Project Handcrafted
Commit a9ba2fdc, authored May 02, 2023 by Jonathan Poalses
Cleanup
parent 3c59b3d4
Showing 1 changed file with 0 additions and 19 deletions
src/poalses/jonathan/dialect/dialect_nlp.clj (+0 -19)
...
@@ -42,9 +42,6 @@
"ner"]
:quote {:extractUnclosedQuotes "true"}}))
(def bad-words #{"why" "cause"})
;; Word sets that will show a sentence as being of that dialect
(def australian-words #{"incorrect" "why"})
...
@@ -73,22 +70,6 @@
(if (empty? dialects) [:standard] dialects)))
;; Another failed attempt
;(defn detect-sentence-dialect [sentence]
; (let [dialects []
; tokens (dl/tokens sentence)]
; (when (some australian-words (dl/text (dl/tokens tokens)))
; (let [dialects (conj dialects :australian)]
; (when (some scottish-words (dl/text (dl/tokens tokens)))
; (let [dialects (conj dialects :scottish)]
; (when (some american-words (dl/text (dl/tokens tokens)))
; (let [ dialects (conj dialects :american)]
; (if (empty? dialects) (conj dialects :standard))
; dialects))))))))
;; Take a text sample and separate it into its sentences, then for each sentence find its dialects, and return the most common dialect
;; A sentence can have an indeterminate number of dialects associated with it, as detect-sentence-dialects can return a collection,
;;when no dialect can be detected it defaults to standard. (IE if there's a sample with 3 sentences, one reads as scottish,
...
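The comment that closes the diff describes choosing the most common dialect across a sample's sentences. A minimal sketch of that tallying step in plain Clojure follows. It assumes a hypothetical helper named `most-common-dialect` that receives the per-sentence dialect collections after detection has already run. The project's real function name, and its sentence splitting through the NLP library, are not visible in this diff.

```clojure
;; Hypothetical helper, not taken from the repository.
(defn most-common-dialect
  "Given a seq of per-sentence dialect collections, e.g.
  [[:scottish] [:standard] [:scottish :american]], return the dialect
  keyword that occurs most often. Returns :standard for empty input,
  mirroring the per-sentence default shown in the diff."
  [sentence-dialects]
  (let [counts (frequencies (apply concat sentence-dialects))]
    (if (empty? counts)
      :standard
      ;; max-key picks the map entry with the highest count; which entry
      ;; wins a tie is not specified here.
      (key (apply max-key val counts)))))

;; Usage: :scottish appears twice, the other dialects once each.
(most-common-dialect [[:scottish] [:standard] [:scottish :american]])
;; => :scottish
```

Flattening with `concat` before counting lets a sentence that matches several dialects contribute one vote to each of them, which fits the comment's note that a sentence can carry more than one dialect.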