[LeetCode] 819. Most Common Word


멋진 개발자가 되고 싶다


오패산개구리 2021. 6. 25. 09:27


Return the most frequently occurring word, excluding the banned words.

Matching is case-insensitive, and punctuation (periods, commas, and so on) is ignored.

Input: paragraph = "Bob hit a ball, the hit BALL flew far after it was hit.", banned = ["hit"]

Output: "ball"

My own solution

import collections
import re
from typing import List


class Solution:
    def mostCommonWord(self, paragraph: str, banned: List[str]) -> str:
        string = paragraph.lower()
        # Replace punctuation with spaces so "ball," and "ball" count as one word
        string = re.sub('[\'?!;",.]', ' ', string)
        str_list = string.split()
        # Drop the banned words
        str_list = [word for word in str_list if word not in banned]
        counts = collections.Counter(str_list)
        return counts.most_common(1)[0][0]

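Explanation:

First, convert paragraph to lowercase and replace the punctuation with spaces.

re.sub, which came up in an earlier post, is very handy here.

One caveat: because the pattern is written inside single quotes, a bare ' in the pattern ends the string literal early and causes a syntax error.

Escaping it as \' solves this.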
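The cleaning step can be tried on its own. This is a minimal sketch using the example paragraph from the problem; the names `cleaned` and `tokens` are only for illustration.

```python
import re

paragraph = "Bob hit a ball, the hit BALL flew far after it was hit."

# Lowercase first, then replace each listed punctuation character with a space.
# The single quote is escaped as \' because the pattern is written
# inside single quotes.
cleaned = re.sub('[\'?!;",.]', ' ', paragraph.lower())
tokens = cleaned.split()

print(tokens)
# ['bob', 'hit', 'a', 'ball', 'the', 'hit', 'ball', 'flew', 'far', 'after', 'it', 'was', 'hit']
```

Replacing punctuation with a space, rather than deleting it, keeps input such as "a,b" from collapsing into the single token "ab".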

Next, a one-line list comprehension filters out the banned words.
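The filtering step might look like this in isolation. Converting `banned` to a set is an optional tweak that is not in the original code; `banned_set` and `kept` are hypothetical names.

```python
tokens = ["bob", "hit", "a", "ball", "the", "hit", "ball"]
banned = ["hit"]

# Optional tweak (an assumption, not part of the original solution):
# membership tests on a set are O(1) on average, while "in" on a list
# scans the list each time.
banned_set = set(banned)
kept = [word for word in tokens if word not in banned_set]

print(kept)  # ['bob', 'a', 'ball', 'the', 'ball']
```

With a short banned list the difference is negligible, so the plain list version used in the solution is fine too.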

Finally, the ever-useful collections.Counter and its most_common method pick out the most frequent word.
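To see why the result is indexed with [0][0], it helps to look at what most_common returns. A small sketch with a hand-written word list:

```python
import collections

words = ["bob", "a", "ball", "the", "ball", "flew"]
counts = collections.Counter(words)

# most_common(1) returns a list holding one (word, count) tuple,
# so [0] selects the tuple and the second [0] selects the word.
top = counts.most_common(1)
print(top)        # [('ball', 2)]
print(top[0][0])  # ball
```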

** Clean solution **

1. List comprehension with a Counter object

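import collections
import re
from typing import List


class Solution:
    def mostCommonWord(self, paragraph: str, banned: List[str]) -> str:
        # Replace every non-word character with a space, lowercase, split,
        # and drop the banned words, all in one comprehension
        words = [word for word in re.sub(r'[^\w]', ' ', paragraph)
                 .lower().split()
                 if word not in banned]

        counts = collections.Counter(words)

        return counts.most_common(1)[0][0]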


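Explanation:

The difference from my code is that the data cleansing step is expressed in a single line.

The regular expression is also cleaner.

In a regular expression, \w means a word character, and ^ at the start of a bracket expression means "not".

So [^\w] matches anything that is not a word character.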
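The pattern can be checked against the example paragraph. This is a sketch for illustration; the comparison with \W (the built-in shorthand for "not a word character") is an addition that does not appear in the book's solution.

```python
import re

paragraph = "Bob hit a ball, the hit BALL flew far after it was hit."

# [^\w] matches any single character that is NOT a word character
# (letters, digits, underscore), so punctuation is replaced by spaces.
cleaned = re.sub(r'[^\w]', ' ', paragraph)
words = cleaned.lower().split()
print(words)

# \W is shorthand for the same character class.
same = re.sub(r'\W', ' ', paragraph)
print(cleaned == same)  # True
```

Unlike the hand-written list of punctuation marks, this pattern also covers symbols that were not listed explicitly.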

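I was curious why the r prefix is used here.

It stands for raw string: Python does not process backslash escapes inside it, so r'\n' is a backslash followed by the letter n, not a newline.

Here the prefix makes sure \w reaches the regex engine unchanged. This particular pattern happens to work without r as well, because \w is not a Python escape sequence, but writing regex patterns as raw strings is the usual convention and avoids surprises with sequences such as \b.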
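The difference between a normal string and a raw string can be checked directly. A small sketch; the sample sentence is made up for illustration.

```python
import re

# In a normal string "\n" is one character (a newline);
# in a raw string r"\n" is two characters: a backslash and the letter n.
print(len("\n"))   # 1
print(len(r"\n"))  # 2

text = "hit the white hits"

# With a raw string, \b reaches the regex engine as a word boundary.
raw_matches = re.findall(r"\bhit\b", text)
# In a normal string, "\b" is a backspace character, so nothing matches.
plain_matches = re.findall("\bhit\b", text)

print(raw_matches)    # ['hit']
print(plain_matches)  # []
```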

The rest is the same as my approach.

Source: 파이썬 알고리즘 인터뷰 (written by 박상길, illustrated by 정진호) [책만]
